(54) Title: FUNCTIONAL DOMAINS IN FLAVOBACTERIUM OKEANOKOITES (FOKI) RESTRICTION ENDONUCLEASE
(57) Abstract

The present inventors have identified the recognition and cleavage domains of the FokI restriction endonuclease. Accordingly, the
present invention relates to DNA segments encoding the recognition and cleavage domains of the FokI restriction endonuclease, respectively.
The 41 kDa N-terminal fragment constitutes the FokI recognition domain while the 25 kDa C-terminal fragment constitutes the FokI cleavage
nuclease domain. The present invention also relates to hybrid restriction enzymes comprising the nuclease domain of the FokI restriction
endonuclease linked to a recognition domain of another enzyme. One such hybrid restriction enzyme is Ubx-F_N. This enzyme contains the
homeo domain of Ubx linked to the cleavage or nuclease domain of FokI. Additionally, the present invention relates to the construction of
two insertion mutants of FokI endonuclease.
FUNCTIONAL DOMAINS IN FLAVOBACTERIUM OKEANOKOITES
(FOKI) RESTRICTION ENDONUCLEASE


BACKGROUND OF THE INVENTION 

1. Field of the Invention

The present invention relates to the FokI
restriction endonuclease system. In particular, the
present invention relates to DNA segments encoding
the separate functional domains of this restriction
endonuclease system.

The present invention also relates to the
construction of two insertion mutants of FokI
endonuclease.

Additionally, the present invention
relates to a hybrid enzyme (Ubx-F_N) prepared by
linking the Ultrabithorax (Ubx) homeo domain to the
cleavage domain (F_N) of FokI.

2. Background Information

Type II endonucleases and modification
methylases are bacterial enzymes that recognize
specific sequences in duplex DNA. The endonuclease
cleaves the DNA while the methylases methylate
adenine or cytosine residues so as to protect the
host-genome against cleavage [Type II restriction
and modification enzymes. In Nucleases (Eds. Modrich
and Roberts) Cold Spring Harbor Laboratory, New
York, pp. 109-154, 1982]. These restriction-
modification (R-M) systems function to protect cells
from infection by phage and plasmid molecules that
would otherwise destroy them.

As many as 2500 restriction enzymes with
over 200 specificities have been detected and
purified (Wilson and Murray, Annu. Rev. Genet.
25:585-627, 1991). The recognition sites of most of
these enzymes are 4-6 base pairs long. The small
size of the recognition sites is beneficial as the
phage genomes are usually small and these small
recognition sites occur more frequently in the
phage.
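
The relation between site length and site frequency can be sketched numerically. The short Python example below is an illustration only: it assumes a random sequence with equal base frequencies (an assumption not made in the text), and the function names are hypothetical.

```python
def expected_site_spacing(site_length: int) -> int:
    """Average number of base pairs between occurrences of one specific
    site of the given length on one strand.

    Assumes random sequence with equal frequencies of A, C, G and T,
    which is an illustrative simplification.
    """
    if site_length < 1:
        raise ValueError("site_length must be at least 1")
    return 4 ** site_length


def expected_site_count(genome_length: int, site_length: int) -> float:
    """Expected number of occurrences of that site in a genome of the
    given length, under the same random-sequence assumption."""
    return genome_length / expected_site_spacing(site_length)


if __name__ == "__main__":
    # Compare typical 4-6 bp sites with a longer 8 bp site.
    for n in (4, 5, 6, 8):
        print(f"{n} bp site: one occurrence per {expected_site_spacing(n)} bp")
```

Under these assumptions a 4 bp site is expected about once per 256 bp and a 6 bp site about once per 4096 bp, so short sites are likely to be present even in a small phage genome.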

Eighty different R-M systems belonging to
the Type IIS class with over 35 specificities have
been identified. This class is unique in that the
cleavage site of the enzyme is separate from the
recognition sequence. Usually the distance between
the recognition site and the cleavage site is quite
precise (Szybalski et al., Gene, 100:13-26, 1991).
Among all these enzymes, the FokI restriction
endonuclease is the most well characterized member
of the Type IIS class. The FokI endonuclease
(RFokI) recognizes asymmetric pentanucleotides in
double-stranded DNA, 5'-GGATG-3' (SEQ ID NO: 1) in
one strand and 3'-CCTAC-5' (SEQ ID NO: 2) in the
other, and introduces staggered cleavages at sites
away from the recognition site (Sugisaki et al.,
Gene 16:73-78, 1981). In contrast, the FokI
methylase (MFokI) modifies DNA thereby rendering the
DNA resistant to digestion by FokI endonuclease.
The FokI restriction and modification genes have
been cloned and their nucleotide sequences deduced
(Kita et al., J. of Biol. Chem. 264:5751-5756,
1989). Nevertheless, the domain structure of the
FokI restriction endonuclease remains unknown,
although a three domain structure has been suggested
(Wilson and Murray, Annu. Rev. Genet. 25:585-627,
1991).

SUMMARY OF THE INVENTION

Accordingly, it is an object of the
present invention to provide isolated domains of
Type IIS restriction endonuclease.

It is another object of the present
invention to provide hybrid restriction enzymes
which are useful for mapping and sequencing of
genomes.

An additional object of the present
invention is to provide two insertion mutants of
FokI which have an increased distance of cleavage
from the recognition site as compared to the wild-
type enzyme. The polymerase chain reaction (PCR) is
utilized to construct the two mutants.

Various other objects and advantages of
the present invention will become obvious from the
drawings and the following description of the
invention.

In one embodiment, the present invention
relates to a DNA segment encoding the recognition
domain of a Type IIS endonuclease which contains the
sequence-specific recognition activity of the Type
IIS endonuclease or a DNA segment encoding the
catalytic domain of a Type IIS endonuclease which
contains the cleavage activity of the Type IIS
endonuclease.

In another embodiment, the present
invention relates to an isolated protein consisting
essentially of the N-terminus or recognition domain
of the FokI restriction endonuclease which protein
has the sequence-specific recognition activity of
the endonuclease or an isolated protein consisting
essentially of the C-terminus or catalytic domain of
the FokI restriction endonuclease which protein has
the nuclease activity of the endonuclease.

In a further embodiment, the present
invention relates to a DNA construct comprising a
first DNA segment encoding the catalytic domain of a
Type IIS endonuclease which contains the cleavage
activity of the Type IIS endonuclease; a second DNA
segment encoding a sequence-specific recognition
domain other than the recognition domain of the Type
IIS endonuclease; and a vector. In the construct,
the first DNA segment and the second DNA segment are
operably linked to the vector to result in the
production of a hybrid restriction enzyme. The
linkage occurs through a covalent bond.

Another embodiment of the present
invention relates to a procaryotic cell comprising a
first DNA segment encoding the catalytic domain of a
Type IIS endonuclease which contains the cleavage
activity of said Type IIS endonuclease; a second DNA
segment encoding a sequence-specific recognition
domain other than the recognition domain of said
Type IIS endonuclease; and a vector. The first DNA
segment and the second DNA segment are operably linked to
the vector such that a single protein is produced.
The first DNA segment may encode, for example, the
catalytic domain (F_N) of FokI, and the second segment
may encode, for example, the homeo domain of Ubx.

In another embodiment, the present
invention relates to a hybrid restriction enzyme
comprising the catalytic domain of a Type IIS
endonuclease which contains the cleavage activity of
the Type IIS endonuclease linked to a recognition
domain of an enzyme or a protein other than the Type
IIS endonuclease from which the cleavage domain is
obtained.






WO 95/09233 PCT/US94/09143 

In a further embodiment, the present
invention relates to a DNA construct comprising a
first DNA segment encoding the catalytic domain of a
Type IIS endonuclease which contains the cleavage
activity of the Type IIS endonuclease; a second DNA
segment encoding a sequence-specific recognition
domain other than the recognition domain of the Type
IIS endonuclease; a third DNA segment comprising one
or more codons, wherein the third DNA segment is
inserted between the first DNA segment and the
second DNA segment; and a vector. Preferably, the
third segment contains four or seven codons.

In another embodiment, the present
invention relates to a procaryotic cell comprising a
first DNA segment encoding the catalytic domain of a
Type IIS endonuclease which contains the cleavage
activity of the Type IIS endonuclease; a second DNA
segment encoding a sequence-specific recognition
domain other than the recognition domain of the Type
IIS endonuclease; a third DNA segment comprising one
or more codons, wherein the third DNA segment is
inserted between the first DNA segment and the
second DNA segment; and a vector. The first DNA
segment and the second DNA segment are operably
linked to the vector so that a single protein is
produced.

BRIEF DESCRIPTION OF THE DRAWINGS 

FIGURE 1 shows sequences of the 5' and 3'
primers used to introduce new translation signals
into fokIM and fokIR genes during PCR amplification
(SEQ ID NOs: 3-9). SD represents Shine-Dalgarno
consensus RBS for Escherichia coli (E. coli) and a 7-
bp spacer separates the RBS from the ATG start
codon. The fokIM primers are flanked by NcoI
sites. The fokIR primers are flanked by BamHI
sites. Start and stop codons are shown in bold
letters. The 18-bp complement sequence is
complementary to the sequence immediately following
the stop codon of the MfokI gene.

FIGURE 2 shows the structure of plasmids
pACYCfokIM, pRRSfokIR and pCBfokIR. The PCR-
modified fokIM gene was inserted at the NcoI site of
pACYC184 to form pACYCfokIM. The PCR-generated
fokIR gene was inserted at the BamHI sites of pRRS
and pCB to form pRRSfokIR and pCBfokIR,
respectively. pRRS possesses a lac UV5 promoter and
pCB contains a strong tac promoter. In addition,
these vectors contain the positive retroregulator
sequence downstream of the inserted fokIR gene.

FIGURE 3 shows SDS (0.1%)-polyacrylamide
(12%) gel electrophoretic profiles at each step in
the purification of FokI endonuclease. Lanes: 1,
protein standards; 2, crude extract from uninduced
cells; 3, crude extract from cells induced with 1 mM
IPTG; 4, phosphocellulose pool; 5, 50-70% (NH4)2SO4
fractionation pool; and 6, DEAE pool.

FIGURE 4 shows SDS (0.1%)-polyacrylamide
(12%) gel electrophoretic profiles of tryptic
fragments at various time points of trypsin
digestion of FokI endonuclease in presence of the
oligonucleotide DNA substrate, d-5'-CCTCTGGATGCTCTC-
3' (SEQ ID NO: 10):5'-GAGAGCATCCAGAGG-3' (SEQ ID
NO: 11). Lanes: 1, protein standards; 2, FokI
endonuclease; 3, 2.5 min; 4, 5 min; 5, 10 min; 6, 20
min; 7, 40 min; 8, 80 min; 9, 160 min of trypsin
digestion respectively. Lanes 10-13: HPLC purified
tryptic fragments. Lanes: 10, 41 kDa fragment; 11,
30 kDa fragment; 12, 11 kDa fragment; and 13, 25 kDa
fragment.

FIGURE 5 shows the identification of DNA
binding tryptic fragments of FokI endonuclease using
an oligo dT-cellulose column. Lanes: 1, protein
standards; 2, FokI endonuclease; 3, 10 min trypsin
digestion mixture of FokI-oligo complex; 4,
tryptic fragments that bound to the oligo dT-
cellulose column; 5, 160 min trypsin digestion
mixture of FokI-oligo complex; 6, tryptic
fragments that bound to the oligo dT-cellulose
column.

FIGURE 6 shows an analysis of the cleavage
properties of the tryptic fragments of FokI
endonuclease.

(A) The cleavage properties of the
tryptic fragments were analyzed by agarose gel
electrophoresis. 1 ng of pTZ19R in 10 mM Tris.HCl
(pH 8), 50 mM NaCl, 1 mM DTT, and 10 mM MgCl2 was
digested with 2 µl of the solution containing the
fragments (tryptic digests, breakthrough and eluate
respectively) at 37°C for 1 hr in a reaction volume
of 10 µl. Lanes 4 to 6 correspond to trypsin
digestion of FokI-oligo complex in absence of
MgCl2. Lanes 7 to 9 correspond to trypsin digestion
of FokI-oligo complex in presence of 10 mM MgCl2.
Lanes: 1, 1 kb ladder; 2, pTZ19R; 3, pTZ19R
digested with FokI endonuclease; 4 and 7, reaction
mixture of the tryptic digests of FokI-oligo
complex; 5 and 8, 25 kDa C-terminal fragment in the
breakthrough volume; 6 and 9, tryptic fragments of
FokI that bound to the DEAE column. The intense
bands at bottom of the gel correspond to excess
oligonucleotides.

(B) SDS (0.1%)-polyacrylamide (12%) gel
electrophoretic profiles of fragments from the DEAE
column. Lanes 3 to 5 correspond to trypsin
digestion of FokI-oligo complex in absence of
MgCl2. Lanes 6 to 8 correspond to trypsin digestion
of FokI-oligo complex in presence of 10 mM MgCl2.
Lanes: 1, protein standards; 2, FokI endonuclease;
3 and 6, reaction mixture of the tryptic digests of
FokI-oligo complex; 4 and 7, 25 kDa C-terminal
fragment in the breakthrough volume; 5 and 8,
tryptic fragments of FokI that bound to the DEAE
column.

FIGURE 7 shows an analysis of sequence-
specific binding of DNA by 41 kDa N-terminal
fragment using gel mobility shift assays. For the
exchange reaction, the complex (10 µl) was incubated
with 1 µl of 32P-labeled specific (or non-specific)
oligonucleotide duplex in a volume of 20 µl
containing 10 mM Tris.HCl, 50 mM NaCl and 10 mM MgCl2
at 37°C for various times. 1 µl of the 5'-32P-
labeled specific probe [d-5'-CCTCTGGATGCTCTC-3' (SEQ
ID NO: 10):5'-GAGAGCATCCAGAGG-3' (SEQ ID NO: 11)]
contained 12 picomoles of the duplex and ~50 x 10^3
cpm. 1 µl of the 5'-32P-labeled non-specific probe
[5'-TAATTGATTCTTAA-3' (SEQ ID NO: 12):5'-
ATTAAGAATCAATT-3' (SEQ ID NO: 13)] contained 12
picomoles of the duplex and ~25 x 10^3 cpm. (A)
Lanes: 1, specific oligonucleotide duplex; 2, 41
kDa N-terminal fragment-oligo complex; 3 and 4,
specific probe incubated with the complex for 30 and
120 min respectively. (B) Lanes: 1, non-specific
oligonucleotide duplex; 2, 41 kDa N-terminal
fragment-oligo complex; 3 and 4, non-specific probe
incubated with the complex for 30 and 120 min
respectively.

FIGURE 8 shows SDS (0.1%)-polyacrylamide
(12%) gel electrophoretic profiles of tryptic
fragments at various time points of trypsin
digestion of FokI endonuclease. The enzyme (200 µg)
in a final volume of 200 µl containing 10 mM
Tris.HCl, 50 mM NaCl and 10 mM MgCl2 was digested with
trypsin at RT. The trypsin to FokI ratio was 1:50
by weight. Aliquots (28 µl) from the reaction
mixture were removed at different time intervals and
quenched with excess antipain. Lanes: 1, protein
standards; 2, FokI endonuclease; 3, 2.5 min; 4, 5.0
min; 5, 10 min; 6, 20 min; 7, 40 min; 8, 80 min; and
9, 160 min of trypsin digestion respectively.

FIGURE 9 shows the tryptic map of FokI
endonuclease. (A) FokI endonuclease fragmentation
pattern in absence of the oligonucleotide substrate.
(B) FokI endonuclease fragmentation pattern in
presence of the oligonucleotide substrate.

FIGURE 10 shows the predicted secondary
structure of FokI based on its primary sequence
using the PREDICT program (see SEQ ID NO: 31). The
trypsin cleavage site of FokI in the presence of DNA
substrates is indicated by the arrow. The
KSELEEKKSEL segment is highlighted. The symbols are
as follows: h, helix; s, sheet; and #, random coil.

FIGURE 11 shows the sequences of the 5'
and 3' oligonucleotide primers used to construct the
insertion mutants of FokI (see SEQ ID NO:32, SEQ ID
NO:33, SEQ ID NO:34, SEQ ID NO:35, SEQ ID NO:36, SEQ
ID NO:37, SEQ ID NO:38 and SEQ ID NO:39,
respectively). The four and seven codon inserts are
shown in bold letters. The amino acid sequence is
indicated over the nucleotide sequence. The same 3'
primer was used in the PCR amplification of both
insertion mutants.

FIGURE 12 shows the SDS/PAGE profiles of
the mutant enzymes purified to homogeneity. Lanes:
1, protein standards; 2, FokI; 3, mutant FokI with
4-codon insertion; and 4, mutant FokI with 7-codon
insertion.

FIGURE 13 shows an analysis of the DNA
sequence specificity of the mutant enzymes. The DNA
substrates were digested in 10 mM Tris.HCl, pH
8.0/50 mM NaCl/1 mM DTT/10 mM MgCl2 at 37°C for 2
hrs.

(A) Cleavage pattern of pTZ19R DNA
substrate analyzed by 1% agarose gel
electrophoresis. 2 µg of pTZ19R DNA was used in each
reaction. Lanes: 1, 1-kilobase (kb) ladder; 2,
pTZ19R; 3, pTZ19R digested with FokI; 4, pTZ19R
digested with mutant FokI with 4-codon insertion;
and 5, pTZ19R digested with mutant FokI with 7-codon
insertion.

(B) Cleavage pattern of 256 bp DNA
substrate containing a single FokI site analyzed by
1.5% agarose gel electrophoresis. 1 µg of
radiolabeled substrates (32P-labeled on individual
strands) was digested as described above. The
agarose gel was stained with ethidium bromide and
visualized under UV light. Lanes 2 to 6 correspond
to the 32P-labeled substrate in which the 5'-CATCC-3'
strand is labeled. Lanes 7 to 11 correspond to
the substrate in which the 5'-GGATG-3' strand is 32P-
labeled. Lanes: 1, 1 kb ladder; 2 and 7, 32P-labeled
250 bp DNA substrates; 3 and 8, 32P-labeled
substrates cleaved with FokI; 4 and 9, wild-type
FokI purified in the laboratory; 5 and 10, mutant FokI
with 4-codon insertion; 6 and 11, mutant FokI with
7-codon insertion.

(C) Autoradiograph of the agarose gel
from above. Lanes: 2 to 11, same as in B.

FIGURE 14 shows an analysis of the
distance of cleavage from the recognition site by
FokI and the mutant enzymes. The unphosphorylated
oligonucleotides were used for dideoxy DNA
sequencing with pTZ19R as the template. The
sequencing products (G, A, T, C) were
electrophoresed on a 6% acrylamide gel containing 7M
urea, and the gel dried. The products were then
exposed to an x-ray film for 2 hrs. Cleavage
products from the 100 bp and the 256 bp DNA
substrates are shown in A and B, respectively. I
corresponds to substrates containing 32P-label on the
5'-GGATG-3' strand, and II corresponds to substrates
containing 32P-label on the 5'-CATCC-3' strand.
Lanes: 1, FokI; 2, FokI; 3, mutant FokI with 4-
codon insertion; and 4, mutant FokI with 7-codon
insertion.

FIGURE 15 shows a map of the cleavage
site(s) of FokI and the mutant enzymes based on the
100 bp DNA substrate containing a single FokI site:
(A) wild-type FokI; (B) mutant FokI with 4-codon
insertion; and (C) mutant FokI with 7-codon
insertion (see SEQ ID NO: 40). The sites of cleavage
are indicated by the arrows. Major cleavage sites
are shown by larger arrows.

FIGURE 16 represents a diagram showing the
orientation of the Ubx homeo domain with respect to
the FokI nuclease domain (F_N) in relation to the DNA
substrate. The crystal structure of an engrailed
homeo domain-DNA complex was reported by Kissinger
et al. (Cell 63: 579-90 (1990)).

FIGURE 17 shows the construction of
expression vectors of the Ubx-F_N hybrid enzyme. (A)
Sequences of the 5' and 3' primers used to construct
the hybrid gene, Ubx-F_N. The Ubx primers are flanked
by PstI and SpeI sites (see SEQ ID NO: 41 and SEQ ID
NO: 42). The Ubx-F_N primers are flanked by NdeI and
BamHI sites (see SEQ ID NO:43 and SEQ ID NO:44).
Start and stop codons are shown in boldface letters.
(B) Structure of plasmids, pRRS Ubx-F_N and pET-15b
Ubx-F_N. The PCR modified Ubx homeo box was
substituted for the PstI/SpeI fragment of pRRSfokIR
to generate pRRS Ubx-F_N. The PCR-generated fragment
using Ubx-F_N primers was inserted at the BamHI/NdeI
sites of pET-15b to form pET-15b Ubx-F_N.

FIGURE 18 represents SDS/PAGE profiles at
each step in the purification of the Ubx-F_N hybrid
enzyme. Lanes: 1, protein standards; 2, crude
extract from induced cells; 3, His-bind™ resin pool;
4, phosphocellulose pool; and 5, DEAE pool.

FIGURE 19 shows a characterization of the
Ubx-F_N hybrid protein using the linearized pUC13 DNA
substrates containing Ubx site(s). (A) pUC13
derived DNA substrates. D: 30 bp insert containing
the Ubx site, 5'-TTAATGGTT-3'. The number of tandem
repeats of the 30 bp insert in these substrates is
shown in brackets. The orientation of the Ubx
site(s) is indicated by the arrows. (B) The DNA
substrate (1 µg) was partially digested in buffer
containing 20 mM Tris.HCl (pH 7.6), 75 mM KCl, 1 mM
DTT, 50 µg/ml BSA, 10% glycerol, 100 mg/ml tRNA and
2 mM MgCl2 at 31°C for 4-5 hrs. The products were
analyzed by 1% agarose gel electrophoresis. The
substrate was present in large excess compared to
the Ubx-F_N hybrid protein (~100:1). The reaction
condition was optimized to yield a single double-
stranded cleavage per substrate molecule. The
reaction proceeds to completion upon increasing the
enzyme concentration or by digesting overnight at
31°C (data not shown). The two fragments, ~1.8 kb
and ~0.95 kb, respectively, resulting from the
binding of the hybrid enzyme at the newly inserted
Ubx site of pUC13 and cleaving near this site, are
indicated by the arrows.

FIGURE 20 shows an analysis of the
distance of cleavage from the recognition site by
Ubx-F_N. The cleavage products of the 32P-labeled DNA
substrate containing a single Ubx site by Ubx-F_N
along with (G + A) Maxam-Gilbert sequencing
reactions were separated by electrophoresis on a 6%
polyacrylamide gel containing 6M urea, and the gel
was dried and exposed to an x-ray film for 6 hrs.
(A) corresponds to cleavage product(s) from a
substrate containing 32P-label on the 5'-TAAT-3'
strand (see SEQ ID NO:45). Lanes: 1, (G + A)
sequencing reaction; and 2, Ubx-F_N. (B) corresponds
to a substrate containing 32P-label on the
complementary strand, 5'-ATTA-3' (see SEQ ID NO:46).
Lanes: 1, (G + A) sequencing reaction; 2, Ubx-F_N.
(C) A map of the cleavage site(s) of Ubx-F_N based on
the DNA substrate containing a single Ubx site. The
recognition site is shown by outline letters. The
site(s) of cleavage are indicated by the arrows.
The purine residues are indicated by * (see SEQ ID
NO:47 and SEQ ID NO:48).

DETAILED DESCRIPTION OF THE INVENTION

The present invention is based on the
identification and characterization of the
functional domains of the FokI restriction
endonuclease. In the experiments resulting in the
present invention, it was discovered that the FokI
restriction endonuclease is a two domain system, one
domain of which possesses the sequence-specific
recognition activity while the other domain contains
the nuclease cleavage activity.

The FokI restriction endonuclease
recognizes the non-palindromic pentanucleotide
5'-GGATG-3' (SEQ ID NO:1):5'-CATCC-3' (SEQ ID NO:2) in
duplex DNA and cleaves 9/13 nucleotides downstream
from the recognition site. Since 10 base pairs are
required for one turn of the DNA helix, the present
inventor hypothesized that the enzyme would interact
with one face of the DNA by binding at one point and
cleave at another point on the next turn of the
helix. This suggested the presence of two separate
protein domains, one for sequence-specific
recognition of DNA and one for endonuclease
activity. The hypothesized two domain structure was
shown to be the correct structure of the FokI
endonuclease system by studies that resulted in the
present invention.
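
The 9/13 cleavage rule stated above can be illustrated with a short sketch. The Python function below is a hypothetical helper written for illustration, not part of the described work. It scans one strand for 5'-GGATG-3' or its complement 5'-CATCC-3' and reports where each strand would be cut, assuming the offsets are counted from the last base of the recognition site. The `near` and `far` parameters are assumptions introduced so that other offsets can be tried.

```python
from typing import List, Tuple

SITE = "GGATG"     # recognition sequence on the strand being scanned
SITE_RC = "CATCC"  # its reverse complement (site lies on the other strand)


def find_cuts(top: str, near: int = 9, far: int = 13) -> List[Tuple[str, int, int]]:
    """Return (orientation, top_cut, bottom_cut) for every site found.

    Coordinates are 0-based positions on the scanned (top) strand; a cut
    index k means the strand is broken between positions k-1 and k.
    "+" means GGATG is on the top strand and cuts fall to the right;
    "-" means GGATG is on the bottom strand and cuts fall to the left.
    Sites whose cut positions fall outside the sequence are skipped.
    """
    top = top.upper()
    n = len(SITE)
    cuts = []
    for i in range(len(top) - n + 1):
        window = top[i:i + n]
        if window == SITE:
            orientation = "+"
            top_cut, bottom_cut = i + n + near, i + n + far
        elif window == SITE_RC:
            orientation = "-"
            top_cut, bottom_cut = i - far, i - near
        else:
            continue
        if 0 <= top_cut <= len(top) and 0 <= bottom_cut <= len(top):
            cuts.append((orientation, top_cut, bottom_cut))
    return cuts


if __name__ == "__main__":
    substrate = "A" * 5 + "GGATG" + "A" * 20
    print(find_cuts(substrate))
```

The two cut positions differ by four nucleotides, which corresponds to the staggered cleavage described in the text. The total reach of 9 to 13 nucleotides is roughly one helical turn of 10 base pairs, which is the observation behind the two domain hypothesis.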

Accordingly, in one embodiment, the
present invention relates to a DNA segment which
encodes the N-terminus of the FokI restriction
endonuclease (preferably, about the N-terminal 2/3's
of the protein). This DNA segment encodes a protein
which has the sequence-specific recognition activity
of the endonuclease, that is, the encoded protein
recognizes the non-palindromic pentanucleotide
d-5'-GGATG-3' (SEQ ID NO:1):5'-CATCC-3' (SEQ ID NO:2) in
duplex DNA. Preferably, the DNA segment of the
present invention encodes amino acids 1-382 of the
FokI endonuclease.

In a further embodiment, the present
invention relates to a DNA segment which encodes the
C-terminus of the FokI restriction endonuclease.
The protein encoded by this DNA segment of the
present invention has the nuclease cleavage activity
of the FokI restriction endonuclease. Preferably,
the DNA segment of the present invention encodes
amino acids 383-578 of the FokI endonuclease. DNA
segments of the present invention can be readily
isolated from biological samples using methods known
in the art, for example, gel electrophoresis,
affinity chromatography, polymerase chain reaction
(PCR), or a combination thereof. Further, the DNA
segments of the present invention can be chemically
synthesized using standard methods in the art.

The present invention also relates to the
proteins encoded by the DNA segments of the present
invention. Thus, in another embodiment, the present
invention relates to a protein consisting
essentially of the N-terminus of the FokI
endonuclease which retains the sequence-specific
recognition activity of the enzyme. This protein of
the present invention has a molecular weight of
about 41 kilodaltons as determined by SDS/
polyacrylamide gel electrophoresis in the presence
of 2-mercaptoethanol.

In a further embodiment, the present
invention relates to a protein consisting
essentially of the C-terminus of the FokI
restriction endonuclease (preferably, the C-terminal
1/3 of the protein). The molecular weight of this
protein is about 25 kilodaltons as determined by
SDS/polyacrylamide gel electrophoresis in the
presence of 2-mercaptoethanol.
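
As a rough cross-check, the gel-derived sizes can be compared with the residue ranges given for the two domains (amino acids 1-382 and 383-578). The sketch below uses an average residue mass of about 110 Da, which is a common rule of thumb and an assumption here, not a value from this description. The function name is hypothetical.

```python
# Rule-of-thumb average mass of one amino acid residue, in daltons.
# This is an assumption for illustration, not a value from the source.
AVERAGE_RESIDUE_MASS_DA = 110.0


def estimate_mass_kda(first_residue: int, last_residue: int) -> float:
    """Estimate the mass in kDa of a fragment spanning the given
    inclusive residue range, using the average residue mass."""
    if last_residue < first_residue:
        raise ValueError("last_residue must not be smaller than first_residue")
    residue_count = last_residue - first_residue + 1
    return residue_count * AVERAGE_RESIDUE_MASS_DA / 1000.0


if __name__ == "__main__":
    print("Recognition domain (1-382):", estimate_mass_kda(1, 382), "kDa")
    print("Cleavage domain (383-578):", estimate_mass_kda(383, 578), "kDa")
```

The estimates, about 42 kDa and about 22 kDa, are of the same order as the 41 kDa and 25 kDa values determined by gel electrophoresis. Apparent sizes on SDS gels can differ from sequence-based estimates, so exact agreement is not expected.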

The proteins of the present invention can
be isolated or purified from a biological sample
using methods known in the art. For example, the
proteins can be obtained by isolating and cleaving
the FokI restriction endonuclease. Alternatively,
the proteins of the present invention can be
chemically synthesized or produced using recombinant
DNA technology and purified.

The DNA segments of the present invention
can be used to generate 'hybrid' restriction enzymes
by linking other DNA binding protein domains with
the nuclease or cleavage domain of FokI. This can
be achieved chemically as well as by recombinant DNA
technology. Such chimeric hybrid enzymes have novel
sequence specificity and are useful for physical
mapping and sequencing of genomes of various
species, such as humans, mice and plants. For
example, such enzymes would be suitable for use in
mapping the human genome. These engineered hybrid
endonucleases will also facilitate the manipulation
of genomic DNA and provide valuable information
about protein structure and protein design.

Such chimeric enzymes are also valuable
research tools in recombinant DNA technology and
molecular biology. Currently only 4-6 base pair
cutters and a few 8 base pair cutters are available
commercially. (There are about 10 endonucleases
which cut >6 base pairs that are available
commercially.) By linking other DNA binding
proteins to the nuclease domain of FokI, new enzymes
can be generated that recognize more than 6 base
pairs in DNA.

Accordingly, in a further embodiment, the
present invention relates to a DNA construct and
the hybrid restriction enzyme encoded therein. The
DNA construct of the present invention comprises a
first DNA segment encoding the nuclease domain of
the FokI restriction endonuclease, a second DNA
segment encoding a sequence-specific recognition
domain and a vector. The first DNA segment and the
second DNA segment are operably linked to the vector
so that expression of the segments can be effected
thereby yielding a chimeric restriction enzyme. The
construct can comprise regulatory elements such as
promoters (for example, T7, tac, trp and lac UV5
promoters), transcriptional terminators or
retroregulators (for example, stem loops). Host
cells (procaryotes such as E. coli) can be
transformed with the DNA constructs of the present
invention and used for the production of chimeric
restriction enzymes.

The hybrid enzymes of the present
invention are comprised of the nuclease domain of
FokI linked to a recognition domain of another
enzyme or DNA binding protein (such as naturally
occurring DNA binding proteins that recognize 6 base
pairs). Suitable recognition domains include, but
are not limited to, the recognition domains of zinc
finger motifs; homeo domain motifs; POU domains
(eukaryotic transcription regulators, e.g., Pit1,
Oct1, Oct2 and unc86); other DNA binding protein
domains of lambda repressor, lac repressor, cro,
gal4; DNA binding protein domains of oncogenes such
as myc, jun; and other naturally occurring sequence-
specific DNA binding proteins that recognize >6 base
pairs.

The hybrid restriction enzymes of the
present invention can be produced by those skilled
in the art using known methodology. For example,
the enzymes can be chemically synthesized or
produced using recombinant DNA technology well known
in the art. The hybrid enzymes of the present
invention can be produced by culturing host cells
(such as HB101, RR1, RB791 and MM294) containing
the DNA construct of the present invention and
isolating the protein. Further, the hybrid enzymes
can be chemically synthesized, for example, by
linking the nuclease domain of FokI to the
recognition domain using common linkage methods
known in the art, for example, using protein cross-
linking agents such as EDC/NHS, DSP, etc.

One particular hybrid enzyme which can be
created according to the present invention and,
thus, an embodiment of the present invention is
Ubx-F_N. The chimeric restriction endonuclease can be
produced by linking the Ubx homeo domain to the
cleavage domain (F_N) of FokI. Subsequent to
purification, the properties of the hybrid enzyme
were analyzed.

While the FokI restriction endonuclease
was the enzyme studied in the following experiments,
it is expected that other Type IIS endonucleases
(such as those listed in Table 2) will function
using a similar two domain structure which one
skilled in the art could readily determine based on
the present invention.

Recently, StsI, a heteroschizomer of FokI,
has been isolated from Streptococcus sanguis (Kita
et al., Nucleic Acids Research 20 (3): 618, 1992).
StsI recognizes the same nonpalindromic
pentadeoxyribonucleotide 5'-GGATG-3':5'-CATCC-3' as
FokI but cleaves 10/14 nucleotides downstream of the
recognition site. The StsI RM system has been
cloned and sequenced (Kita et al., Nucleic Acids
Research 20 (16): 4167-72, 1992). Considerable amino
acid sequence homology (~30%) has been detected
between the endonucleases, FokI and StsI.

Another embodiment of the invention
relates to the construction of two insertion mutants
of FokI endonuclease using the polymerase chain
reaction (PCR). In particular, this embodiment
includes a DNA construct comprising a first DNA
segment encoding the catalytic domain of a Type IIS
endonuclease which contains the cleavage activity of
the Type IIS endonuclease, a second DNA segment
encoding a sequence-specific recognition domain
other than the recognition domain of the Type IIS
endonuclease, and a third DNA segment comprising one
or more codons. The third DNA segment is inserted
between the first DNA segment and the second DNA
segment. The construct also includes a vector.
The Type IIS endonuclease is FokI restriction
endonuclease.

Suitable recognition domains include, but
are not limited to, zinc finger motifs, homeo domain
motifs, POU domains, DNA binding domains of
repressors, DNA binding domains of oncogenes and
naturally occurring sequence-specific DNA binding
proteins that recognize >6 base pairs.

As noted above, the recognition domain of
FokI restriction endonuclease is at the amino
terminus of FokI endonuclease, whereas the cleavage
domain is probably at the carboxyl terminal third of
the molecule. It is likely that the domains are
connected by a linker region, which defines the
spacing between the recognition and the cleavage
sites of the DNA substrate. This linker region of
FokI is susceptible to cleavage by trypsin in the
presence of a DNA substrate, yielding a 41-kDa
amino-terminal fragment (the DNA binding domain) and
a 25-kDa carboxyl-terminal fragment (the cleavage
domain). Secondary structure prediction of FokI
endonuclease based on its primary amino acid
sequence supports this hypothesis (see Figure 10).
The predicted structure reveals a long stretch of
alpha helix region at the junction of the
recognition and cleavage domains. This helix
probably constitutes the linker which connects the
two domains of the enzyme. Thus, it was thought
that the cleavage distance of FokI from the
recognition site could be altered by changing the
length of this spacer (the alpha helix). Since 3.6
amino acids are required to form one turn of the
alpha helix, insertion of either four codons or
seven codons in this region would extend the
pre-existing helix in the native enzyme by approximately
one or two turns, respectively. Close examination of
the amino acid sequence of this helix region revealed
the presence of two KSEL repeats separated by amino
acids EEK (Figure 10) (see SEQ ID NO:21). The
segments KSEL (4 codons) (see SEQ ID NO:22) and
KSELEEK (7 codons) (see SEQ ID NO:23) appeared to be
good choices for insertion within this helix in
order to extend it by one and two turns,
respectively. (See Examples X and XI.) Thus,
genetic engineering was utilized in order to create
mutant enzymes.
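
The helix arithmetic behind the choice of four and seven codons can be sketched as follows. The figure of 3.6 residues per turn comes from the text; the rise of 1.5 angstroms per residue is the usual textbook value for an ideal alpha helix and is an assumption here, as is the function name.

```python
RESIDUES_PER_TURN = 3.6    # stated in the text
RISE_PER_RESIDUE_A = 1.5   # angstroms per residue; textbook value (assumption)


def helix_extension(n_residues):
    """Return (turns, length in angstroms) added by n_residues of ideal helix."""
    turns = n_residues / RESIDUES_PER_TURN
    length_a = n_residues * RISE_PER_RESIDUE_A
    return turns, length_a


for segment in ("KSEL", "KSELEEK"):
    turns, length_a = helix_extension(len(segment))
    print(f"{segment}: {turns:.2f} turns, {length_a:.1f} angstroms")
```

Under these assumptions the four-residue insert adds slightly more than one turn and the seven-residue insert slightly less than two, which is why "one" and "two" turns are approximations.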

In particular, the mutants are obtained by
inserting one or more, and preferably four or seven,
codons between the recognition and cleavage domains
of FokI. More specifically, the four or seven
codons are inserted at nucleotide 1152 of the gene
encoding the endonuclease. The mutants have the
same DNA sequence specificity as the wild-type
enzyme. However, they cleave one nucleotide further
away from the recognition site on both strands of
the DNA substrates as compared to the wild-type
enzyme.
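
A minimal sketch of an in-frame insertion at nucleotide 1152 is given below. It assumes that the position counts nucleotides from the first base of the coding sequence, which this passage does not state. The codons chosen for K-S-E-L and the placeholder gene are hypothetical and are not the sequences used in the Examples.

```python
def insert_codons(gene, position, insert):
    """Insert `insert` after the first `position` nucleotides of `gene`.

    Raises ValueError if the insertion would disturb the reading frame.
    """
    if position % 3 != 0:
        raise ValueError("insertion point is not on a codon boundary")
    if len(insert) % 3 != 0:
        raise ValueError("insert length is not a multiple of three")
    return gene[:position] + insert + gene[position:]


# Hypothetical codon choice for Lys-Ser-Glu-Leu (AAA TCT GAA CTG).
KSEL_CODONS = "AAATCTGAACTG"

# Placeholder open reading frame, NOT the fokIR sequence.
toy_gene = "ATG" + "GCT" * 500

mutant = insert_codons(toy_gene, 1152, KSEL_CODONS)
print(len(toy_gene), len(mutant), 1152 // 3)
```

Because 1152 is a multiple of three, an insert of four or seven whole codons at that point leaves the downstream reading frame unchanged under the numbering assumption above.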

Analysis of the cut sites of FokI and the
mutants, based on the cleavage of the 100 bp
fragment, is summarized in Figure 15. Insertion of
four (or seven) codons between the recognition and
cleavage domains of FokI is accompanied by an
increase in the distance of cleavage from the
recognition site. This information further supports
the presence of two separate protein domains within
the FokI endonuclease: one for the sequence-specific
recognition and the other for the
endonuclease activity. The two domains are
connected by a linker region which defines the
spacing between the recognition and the cleavage
sites of the DNA substrate. The modular structure
of the enzyme suggests it may be feasible to
construct chimeric endonucleases of different
sequence specificity by linking other DNA-binding
proteins to the cleavage domain of the FokI
endonuclease.

In view of the above information, another
embodiment of the invention includes a procaryotic
cell comprising a first DNA segment encoding the
catalytic domain of a Type IIS endonuclease which
contains the cleavage activity of the Type IIS
endonuclease, a second DNA segment encoding a
sequence-specific recognition domain other than the
recognition domain of the Type IIS endonuclease, and
a third DNA segment comprising one or more codons.
The third DNA segment is inserted between the first
DNA segment and the second DNA segment. The cell
also includes a vector. Additionally, it should be
noted that the first DNA segment, the second DNA
segment, and the third DNA segment are operably
linked to the vector so that a single protein is
produced. The third segment may consist essentially
of four or seven codons.

The present invention also includes the
protein produced by the procaryotic cell referred to
directly above. In particular, the isolated protein
consists essentially of the recognition domain of
the FokI restriction endonuclease, the catalytic
domain of the FokI restriction endonuclease, and
amino acids encoded by the codons present in the
third DNA segment.

The following non-limiting Examples are
provided to describe the present invention in
greater detail.

EXAMPLES

The following materials and methods were
utilized in the isolation and characterization of
the FokI restriction endonuclease functional domains
as exemplified hereinbelow.

Bacterial strains and plasmids

Recombinant plasmids were transformed into
E. coli RB791 i^q cells, which carry the lac i^q allele
on the chromosome (Brent and Ptashne, PNAS USA,
78:4204-4208, 1981), or E. coli RR1 cells. Plasmid
pACYCfokIM is a derivative of pACYC184 carrying the
PCR-generated fokIM gene inserted into the NcoI site.
The plasmid expresses the FokI methylase
constitutively and was present in RB791 cells (or
RR1 cells) whenever the fokIR gene was introduced on
a separate compatible plasmid. The FokI methylase
modifies FokI sites and provides protection against
chromosomal cleavage. The construction of vectors
pRRS and pCB is described elsewhere (Skoglund et
al., Gene, 88:1-5, 1990).

Enzymes, biochemicals and oligos

Oligo primers for PCR were synthesized
with an Applied Biosystems DNA synthesizer using
cyanoethyl phosphoramidite chemistry and purified by
reversed-phase HPLC. Restriction enzymes were
purchased from New England Biolabs. The DNA ligase
and IPTG were from Boehringer-Mannheim. PCR reagents
were purchased as a GeneAmp kit from Perkin-Elmer.
The plasmid purification kit was from QIAGEN.

Restriction enzyme assays

Cells from a 5-ml sample of culture
medium were harvested by centrifugation, resuspended
in 0.5 ml sonication buffer [50 mM Tris.HCl (pH 8),
14 mM 2-mercaptoethanol], and disrupted by sonication
(3 x 5 seconds each) on ice. The cellular debris
was centrifuged and the crude extract used in the
enzyme assay. Reaction mixtures (10 µl) contained
10 mM Tris.HCl (pH 8), 10 mM MgCl2, 7 mM
2-mercaptoethanol, 50 µg of BSA, 1 µg of plasmid
pTZ19R (U.S. Biochemicals) and 1 µl of crude enzyme.
Incubation was at 37°C for 15 min. tRNA (10 µg) was
added to the reaction mixtures when necessary to
inhibit non-specific nucleases. After digestion,
1 µl of dye solution (100 mM EDTA, 0.1% bromophenol
blue, 0.1% xylene cyanol, 50% glycerol) was added,
and the samples were electrophoresed on a 1% agarose
gel. Bands were stained with 0.5 µg ethidium
bromide/ml and visualized with 310-nm ultraviolet
light.

SDS/PAGE

Proteins were prepared in sample
buffer and electrophoresed in SDS (0.1%)-
polyacrylamide (12%) gels as described by Laemmli
(Laemmli, Nature, 227:680-685, 1970). Proteins were
stained with coomassie blue.

Example I
Cloning of FokI RM system

The FokI system was cloned by selecting
for the modification phenotype. Flavobacterium
okeanokoites strain DNA was isolated by the method
described by Caserta et al. (Caserta et al., J.
Biol. Chem., 262:4770-4777, 1987). Several
Flavobacterium okeanokoites genome libraries were
constructed in plasmids pBR322 and pUC13 using the
cloning enzymes PstI, BamHI and BglII. Plasmid
library DNA (10 µg) was digested with 100 units of
FokI endonuclease to select for plasmids expressing
the fokIM+ phenotype.

Surviving plasmids were transformed into
RR1 cells and transformants were selected on plates
containing the appropriate antibiotic. After two rounds
of biochemical enrichment, several plasmids
expressing the fokIM+ phenotype from these libraries
were identified. Plasmids from these clones were
totally resistant to digestion by FokI.

Among eight transformants that were
analyzed from the F. okeanokoites pBR322 PstI
library, two appeared to carry the fokIM gene and
plasmids from these contained a 5.5 kb PstI
fragment. Among eight transformants that were
picked from the F. okeanokoites pBR322 BamHI library,
two appeared to carry the fokIM gene and their
plasmids contained a ~18 kb BamHI fragment. Among
eight transformants that were analyzed from the F.
okeanokoites genome BglII library in pUC13, six
appeared to carry the fokIM gene. Three of these
clones had an 8 kb BglII insert while the rest
contained a 16 kb BglII fragment.

Plating efficiency of phage λ on these
clones suggested that they also carried the fokIR
gene. The clones with the 8-kb BglII insert
appeared to be most resistant to phage infection.
Furthermore, FokI endonuclease activity was
detected in the crude extract of this clone after
partial purification on a phosphocellulose column.
The plasmid pUCfokIRM from this clone was chosen
for further characterization.

The 5.5 kb PstI fragment was transferred
to M13 phages and the nucleotide sequences of parts
of this insert were determined using Sanger's sequencing
method (Sanger et al., PNAS USA, 74:5463-5467,
1977). The complete nucleotide sequence of the FokI
RM system has been published by other laboratories
(Looney et al., Gene, 80:193-208, 1989; Kita et al.,
Nucleic Acid Res., 17:8741-8753, 1989; Kita et al.,
J. Biol. Chem., 264:5751-5756, 1989).

Example II

Construction of an efficient overproducer
clone of FokI endonuclease using
polymerase chain reaction.

The PCR technique was used to alter
transcriptional and translational signals
surrounding the fokIR gene so as to achieve
overexpression in E. coli (Skoglund et al., Gene,
88:1-5, 1990). The ribosome-binding sites preceding
the fokIR and fokIM genes were altered to match the
consensus E. coli signal.

In the PCR reaction, plasmid pUCfokIRM DNA
linearized with BamHI was used as the template. PCR
reactions (100 µl) contained 0.25 nmol of each
primer, 50 µM of each dNTP, 10 mM Tris.HCl (pH 8.3
at 25°C), 50 mM KCl, 1.5 mM MgCl2, 0.01% (w/v)
gelatin, 1 ng of template DNA and 5 units of Taq DNA
polymerase. The oligo primers used for the
amplification of the fokIR and fokIM genes are shown
in Figure 1. Reaction mixtures (run in
quadruplicate) were overlaid with mineral oil and
reactions were carried out using a Perkin-Elmer-Cetus
Thermal Cycler.

Initial template denaturation was
programmed for 2 min. Thereafter, the cycle profile
was programmed as follows: 2 min at 37°C
(annealing), 5 min at 72°C (extension), and 1 min at
94°C (denaturation). This profile was repeated for
25 cycles and the final 72°C extension was increased
to 10 min. The aqueous layers of the reaction
mixtures were pooled and extracted once with 1:1
phenol/chloroform and twice with chloroform. The
DNA was ethanol-precipitated and resuspended in 20
µl TE buffer [10 mM Tris.HCl (pH 7.5), 1 mM EDTA].
The DNA was then cleaved with appropriate
restriction enzymes to generate cohesive ends and
gel-purified.
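
The cycle profile described above can be written down as a small data structure, which also gives the total programmed hold time. This is an illustrative sketch only: the temperature of the initial denaturation is not stated in the text and is assumed to be 94°C, ramp times between temperatures are ignored, and all names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Step:
    name: str
    temp_c: int
    minutes: float


# Initial denaturation: 2 min is from the text, 94 C is an assumption.
INITIAL_DENATURATION = Step("initial denaturation", 94, 2)

# One cycle, in the order given in the text.
CYCLE = [
    Step("annealing", 37, 2),
    Step("extension", 72, 5),
    Step("denaturation", 94, 1),
]
N_CYCLES = 25
FINAL_EXTENSION_MIN = 10  # the last 72 C extension is lengthened to 10 min


def total_hold_minutes():
    """Sum of programmed hold times, excluding ramp times."""
    per_cycle = sum(step.minutes for step in CYCLE)
    extension = next(s for s in CYCLE if s.name == "extension")
    extra_final = FINAL_EXTENSION_MIN - extension.minutes
    return INITIAL_DENATURATION.minutes + N_CYCLES * per_cycle + extra_final


print(total_hold_minutes(), "min")
```

Under these assumptions the programmed holds add up to a little under three and a half hours per run.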

The construction of an over-producer clone
was done in two steps. First, the PCR-generated DNA
containing the fokIM gene was digested with NcoI and
gel-purified. It was then ligated into NcoI-cleaved
and dephosphorylated pACYC184, and the recombinant
DNA was transfected into E. coli RB791 i^q or RR1 cells
made competent as described by Maniatis et al.
(Maniatis et al., Molecular Cloning: A Laboratory
Manual, Cold Spring Harbor Laboratory, Cold Spring
Harbor, NY, 1982). After Tc selection, several
clones were picked and plasmid DNA was examined by
restriction analysis for the presence of the fokIM gene
fragment in correct orientation to the
chloramphenicol promoter of the vector (see Figure
2). This plasmid expresses FokI methylase
constitutively, and this protects the host from
chromosomal cleavage when the fokIR gene is
introduced into the host on a compatible plasmid.
The plasmid DNA from these clones is therefore
resistant to FokI digestion.

Second, the PCR-generated fokIR fragment
was ligated into BamHI-cleaved and dephosphorylated
high expression vectors pRRS or pCB. pRRS possesses
a lac UV5 promoter and pCB contains the strong tac
promoter. In addition, these vectors contain the
positive retroregulator stem-loop sequence derived
from the crystal protein-encoding gene of Bacillus
thuringiensis downstream of the inserted fokIR gene.
The recombinant DNA was transfected into competent
E. coli RB791 i^q[pACYCfokIM] or RR1[pACYCfokIM] cells.
After Tc and Ap antibiotic selection, several clones
were picked and plasmid DNA was examined by
restriction analysis for the fokIR gene fragment in
correct orientation for expression from the vector
promoters. These constructs were then examined for
enzyme production.

To produce the enzyme, plasmid-containing
RB791 i^q or RR1 cells were grown at 37°C with shaking
in 2x concentrated TY medium [1.6% tryptone, 1%
yeast extract, 0.5% NaCl (pH 7.2)] supplemented with
20 µg Tc/ml (except for the pUCfokIRM plasmid) and
50 µg Ap/ml. IPTG was added to a concentration of 1
mM when the cell density reached O.D.600 = 0.8. The
cells were incubated overnight (12 hr) with shaking.
As is shown in Figure 2, both constructs yield FokI
to a level of 5-8% of the total cellular protein.

Example III
Purification of FokI endonuclease

A simple three-step purification procedure
was used to obtain electrophoretically homogeneous
FokI endonuclease. RR1[pACYCfokIM, pRRSfokIR] cells were
grown in 6 L of 2x TY containing 20 µg Tc/ml and 50
µg Ap/ml at 37°C to A600 = 0.8 and then induced
overnight with 1 mM IPTG. The cells were harvested
by centrifugation and then resuspended in 250 ml of
buffer A [10 mM Tris.phosphate (pH 8.0), 7 mM
2-mercaptoethanol, 1 mM EDTA, 10% glycerol] containing
50 mM NaCl.

The cells were disrupted at maximum
intensity on a Branson Sonicator for 1 hr at 4°C.
The sonicated cells were centrifuged at 12,000 g for
2 hr at 4°C. The supernatant was then diluted to 1 L
with buffer A containing 50 mM NaCl. The
supernatant was loaded onto a 10 ml phosphocellulose
(Whatman) column pre-equilibrated with buffer A
containing 50 mM NaCl. The column was washed with
50 ml of loading buffer and the protein was eluted
with an 80-ml total gradient of 0.05 M to 0.5 M NaCl in
buffer A. The fractions were monitored by A280
absorption and analyzed by electrophoresis on SDS
(0.1%)-polyacrylamide (12%) gels (Laemmli, Nature,
227:680-685, 1970). Proteins were stained with
coomassie blue.
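
For orientation, the nominal NaCl concentration at any point of the 80-ml gradient can be estimated with a short sketch. It assumes an ideal linear gradient with no mixing or dead volume, which the text does not specify; the function name is illustrative.

```python
def nacl_molar(volume_ml, start_m=0.05, end_m=0.5, total_ml=80.0):
    """Nominal NaCl concentration (M) after `volume_ml` of a linear gradient."""
    if not 0 <= volume_ml <= total_ml:
        raise ValueError("volume lies outside the gradient")
    return start_m + (end_m - start_m) * volume_ml / total_ml


for v in (0, 20, 40, 60, 80):
    print(f"{v:2d} ml: {nacl_molar(v):.3f} M NaCl")
```

Such an estimate only indicates roughly where a fraction sits in the gradient; the actual salt concentration of a fraction would have to be measured.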

Restriction endonuclease activity of the
fractions was assayed using pTZ19R as substrate.
The fractions containing FokI were pooled and
fractionated with ammonium sulfate. The 50-70%
ammonium sulfate fraction contained the FokI
endonuclease. The precipitate was resuspended in 50
ml of buffer A containing 25 mM NaCl and loaded onto
a DEAE column. FokI does not bind to DEAE while
many contaminating proteins do. The flow-through
was concentrated on a phosphocellulose column.
Further purification was achieved using a gel
filtration (AcA 44) column. The FokI was purified
to electrophoretic homogeneity using this procedure.

SDS (0.1%)-polyacrylamide (12%) gel
electrophoresis profiles of protein species present
at each stage of purification are shown in Figure 3.
The sequence of the first ten amino acids of the
purified enzyme was determined by protein
sequencing. The determined sequence was the same as
that predicted from the nucleotide sequence.




WO 95/09233 . PCT/US94/09143 

Crystals of this purified enzyme have also been
grown using PEG 4000 as the precipitant. FokI
endonuclease was purified further using an AcA 44 gel
filtration column.

Example IV

Analysis of FokIR endonuclease by
trypsin cleavage in the presence of DNA substrate.

Trypsin is a serine protease and it
cleaves at the C-terminal side of lysine and
arginine residues. This is a very useful enzyme to
study the domain structure of proteins and enzymes.
Trypsin digestion of FokI in the presence of its
substrate, d-5'-CCTCTGGATGCTCTC-3' (SEQ ID NO:10):
5'-GAGAGCATCCAGAGG-3' (SEQ ID NO:11), was carried out
with an oligonucleotide duplex to FokI molar ratio
of 2.5:1. FokI (200 µg) was incubated with the
oligonucleotide duplex in a volume of 180 µl containing
10 mM Tris.HCl, 50 mM NaCl, 10% glycerol and 10 mM
MgCl2 at RT for 1 hr. Trypsin (20 µl, 0.2 mg/ml) was
added to the mixture. Aliquots (28 µl) from the
reaction mixture were removed at different time
intervals and quenched with excess trypsin
inhibitor, antipain. The tryptic fragments were
purified by reversed-phase HPLC and their N-terminal
sequences determined using an automatic protein
sequenator from Applied Biosystems.
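
The cleavage rule stated above (trypsin cuts on the C-terminal side of lysine and arginine) can be sketched as a function that lists every potential tryptic peptide of a sequence. The example input is the linker motif described earlier in the specification (two KSEL repeats separated by EEK). This is an exhaustive in-silico digest: in the limited proteolysis experiments described here only exposed sites are cut, and the common refinement that trypsin cuts poorly before proline is deliberately omitted. The function name is illustrative.

```python
def tryptic_peptides(protein):
    """Split a one-letter protein sequence after every K or R."""
    peptides = []
    start = 0
    for i, residue in enumerate(protein):
        if residue in "KR":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])  # C-terminal remainder
    return peptides


linker_motif = "KSELEEKKSEL"  # KSEL-EEK-KSEL as described in the text
print(tryptic_peptides(linker_motif))
```

The motif contains three lysines, which is consistent with a linker region being a plausible target for trypsin when it is accessible.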

The time course of trypsin digestion of
FokI endonuclease in the presence of 2.5 molar
excess of oligonucleotide substrate and 10 mM MgCl2
is shown in Figure 4. At the 2.5 min time point
only two major fragments other than the intact FokI
were present, a 41 kDa fragment and a 25 kDa
fragment. Upon further trypsin digestion, the 41
kDa fragment degraded into a 30 kDa fragment and an 11
kDa fragment. The 25 kDa fragment appeared to be
resistant to any further trypsin digestion. This
fragment appeared to be less stable if the trypsin
digestion of the FokI-oligo complex was carried out in
the absence of MgCl2.

Only three major fragments (30 kDa, 25 kDa
and 11 kDa) were present at the 160 min time point.
Each of these fragments (41 kDa, 30 kDa, 25 kDa and
11 kDa) was purified by reversed-phase HPLC and
their N-terminal amino acid sequences were determined
(Table I). By comparing these N-terminal sequences
to the predicted sequence of FokI, the 41 kDa and 25
kDa fragments were identified as N-terminal and
C-terminal fragments, respectively. In addition, the
30 kDa fragment was N-terminal.

Example V

Isolation of DNA binding tryptic
fragments of FokI endonuclease using oligo
dT-cellulose affinity column.

The DNA binding properties of the tryptic
fragments were analyzed using an oligo dT-cellulose
column. FokI (160 µg) was incubated with a 2.5
molar excess of oligonucleotide duplex
[d-5'-CCTCTGGATGCTCTC(A)15-3' (SEQ ID NO:14):
5'-GAGAGCATCCAGAGG(A)15-3' (SEQ ID NO:15)] in a volume
of 90 µl containing 10 mM Tris.HCl (pH 8), 50 mM
NaCl, 10% glycerol and 10 mM MgCl2 at RT for 1 hr.
Trypsin (10 µl, 0.2 mg/ml) was added to the solution
to initiate digestion. The ratio of trypsin to FokI
(by weight) was 1:80. Digestion was carried out
for 10 min to obtain predominantly the 41 kDa N-terminal
fragment and the 25 kDa C-terminal fragment in the
reaction mixture. The reaction was quenched with a
large excess of antipain (10 µg) and diluted in
loading buffer [10 mM Tris.HCl (pH 8.0), 1 mM EDTA
and 100 mM MgCl2] to a final volume of 400 µl.

The solution was loaded onto an oligo
dT-cellulose column (0.5 ml, Sigma, catalog #0-7751)
pre-equilibrated with the loading buffer. The
breakthrough was passed over the oligo dT-cellulose
column six times. The column was washed with 5 ml
of loading buffer and then eluted twice with 0.4 ml
of 10 mM Tris.HCl (pH 8.0), 1 mM EDTA. These
fractions contained the tryptic fragments that were
bound to the oligonucleotide DNA substrate. The
tryptic fragment bound to the oligo dT-cellulose
column was analyzed by SDS-polyacrylamide gel
electrophoresis.

In a separate reaction, the trypsin
digestion was carried out for 160 min to obtain
predominantly the 30 kDa, 25 kDa and 11 kDa
fragments in the reaction mixture.
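
The trypsin to FokI weight ratio of 1:80 quoted for the 10 min digestion in this Example follows directly from the stated volumes and concentrations, as the short check below shows. The same arithmetic applied to the amounts given in Example IV (20 µl of trypsin for 200 µg of FokI) gives 1:50; that second ratio is derived here and is not stated in the text. The helper name is illustrative.

```python
def mass_ug(volume_ul, conc_mg_per_ml):
    """Mass in micrograms; ul x mg/ml is numerically equal to ug."""
    return volume_ul * conc_mg_per_ml


# Example V: 10 ul of 0.2 mg/ml trypsin added to 160 ug of FokI.
trypsin_v_ug = mass_ug(10, 0.2)
ratio_v = 160 / trypsin_v_ug

# Example IV: 20 ul of 0.2 mg/ml trypsin added to 200 ug of FokI.
trypsin_iv_ug = mass_ug(20, 0.2)
ratio_iv = 200 / trypsin_iv_ug

print(f"Example V:  1:{ratio_v:.0f}")
print(f"Example IV: 1:{ratio_iv:.0f}")
```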

Trypsin digestion of FokI endonuclease for
10 min yielded the 41 kDa N-terminal fragment and the 25
kDa C-terminal fragment as the predominant species
in the reaction mixture (Figure 5, Lane 3). When
this mixture was passed over the oligo dT-cellulose
column, only the 41 kDa N-terminal fragment was
retained by the column, suggesting that the DNA
binding property of FokI endonuclease is in the
N-terminal two-thirds of the enzyme. The 25 kDa fragment
was not retained by the oligo dT-cellulose column.

Trypsin digestion of the FokI-oligo complex
for 160 min yielded predominantly the 30 kDa, 25 kDa
and 11 kDa fragments (Figure 5, Lane 5). When this
reaction mixture was passed over the oligo dT-cellulose
column, only the 30 kDa and 11 kDa fragments were
retained. It appears these species together bind
DNA and they arise from further degradation of the 41
kDa N-terminal fragment. The 25 kDa fragment was
not retained by the oligo dT-cellulose column. It also
did not bind to DEAE and thus could be purified by
passage through a DEAE column and recovering it in
the breakthrough volume.

FokI (390 µg) was incubated with 2.5 molar
excess of oligonucleotide duplex
[d-5'-CCTCTGGATGCTCTC-3' (SEQ ID NO:10):
5'-GAGAGCATCCAGAGG-3' (SEQ ID NO:11)] in a total
volume of 170 µl
containing 10 mM Tris.HCl (pH 8), 50 mM NaCl and 10%
glycerol at RT for 1 hr. Digestion with trypsin (30
µl; 0.2 mg/ml) in the absence of MgCl2 was for 10 min
at RT to maximize the yield of the 41 kDa N-terminal
fragment. The reaction was quenched with excess
antipain (200 µl). The tryptic digest was passed
through a DEAE column. The 25 kDa C-terminal
fragment was recovered in the breakthrough volume.
All the other tryptic fragments (41 kDa, 30 kDa and
11 kDa) were retained by the column and were eluted
with 0.5 M NaCl buffer (3 x 200 µl). In a separate
experiment, the trypsin digestion of the FokI-oligo
complex was done in the presence of 10 mM MgCl2 at RT for
60 min to maximize the yield of the 30 kDa and 11 kDa
fragments. The purified 25 kDa fragment cleaved
non-specifically both unmethylated DNA substrate
(pTZ19R; Figure 6) and methylated DNA substrate
(pACYCfokIM) in the presence of MgCl2. The
products are small, indicating that the fragment is
relatively non-specific in cleavage. The products were
dephosphorylated using calf intestinal phosphatase
and rephosphorylated using polynucleotide kinase and
[γ-32P]ATP. The 32P-labeled products were digested
to mononucleotides using DNase I and snake venom
phosphodiesterase. Analysis of the mononucleotides
by PEI-cellulose chromatography indicates that the
25 kDa fragment cleaved preferentially
phosphodiester bonds 5' to G>A>>T~C. The 25 kDa
C-terminal fragment thus constitutes the cleavage
domain of FokI endonuclease.

The 41 kDa N-terminal fragment-oligo
complex was purified by agarose gel electrophoresis.
FokI endonuclease (200 µg) was incubated with 2.5
molar excess of oligonucleotide duplex
[d-5'-CCTCTGGATGCTCTC-3' (SEQ ID NO:10):
5'-GAGAGCATCCAGAGG-3' (SEQ ID NO:11)] in a volume of 180
µl containing 10 mM Tris.HCl (pH 8.0), 50 mM NaCl
and 10% glycerol at RT for 1 hr. Tracer amounts of
32P-labeled oligonucleotide duplex were incorporated
into the complex to monitor it during gel
electrophoresis. Digestion with trypsin (20 µl; 0.2
mg/ml) was for 12 min at RT to maximize the yield of
the 41 kDa N-terminal fragment. The reaction was
quenched with excess antipain. The 41 kDa
N-terminal fragment-oligo complex was purified by
agarose gel electrophoresis. The band corresponding
to the complex was excised and recovered by
electroelution in a dialysis bag
(~600 µl). Analysis of the complex by SDS-PAGE
revealed the 41 kDa N-terminal fragment to be the major
component. The 30 kDa N-terminal fragment and the
11 kDa C-terminal fragment were present as minor
components. These together appeared to bind DNA and
co-migrate with the 41 kDa N-terminal fragment-oligo
complex.

The binding specificity of the 41 kDa
N-terminal fragment was determined using gel mobility
shift assays.

Example VI
Gel mobility shift assays

The specific oligos (d-5'-CCTCTGGATGCTCTC-3'
(SEQ ID NO:10) and d-5'-GAGAGCATCCAGAGG-3' (SEQ ID
NO:11)) were 5'-32P-labeled in a reaction mixture of
25 µl containing 40 mM Tris.HCl (pH 7.5), 20 mM MgCl2, 50
mM NaCl, 10 mM DTT, 10 units of T4 polynucleotide
kinase (from New England Biolabs) and 20 µCi [γ-32P]ATP
(3000 Ci/mmol). The mixture was incubated at
37°C for 30 min. The kinase was inactivated by
heating the reaction mixture to 70°C for 15 min.
After addition of 200 µl of water, the solution was
passed through a Sephadex G-25 (Superfine) column
(Pharmacia) to remove the unreacted [γ-32P]ATP. The
final concentration of labeled single-strand oligos
was 27 µM.

The single strands were then annealed to
form the duplex in 10 mM Tris.HCl (pH 8.0), 50 mM
NaCl to a concentration of 12 µM. 1 µl of the
solution contained ~12 picomoles of oligo duplex
and ~50 x 10^3 cpm. The non-specific oligos
(d-5'-TAATTGATTCTTAA-3' (SEQ ID NO:12) and
d-5'-ATTAAGAATCAATT-3' (SEQ ID NO:13)) were labeled with
[γ-32P]ATP and polynucleotide kinase as described
herein. The single-stranded oligos were annealed to
yield the duplex at a concentration of 12 µM. 1 µl
of the solution contained ~12 picomoles of oligo
duplex and ~25 x 10^3 cpm.

10 µl of 41 kDa N-terminal fragment-oligo
complex (~2 pmoles) in 10 mM Tris.HCl, 50 mM NaCl
and 10 mM MgCl2 was incubated with 1 µl of
32P-labeled specific oligonucleotide duplex (or
32P-labeled non-specific oligonucleotide duplex) at 37°C
for 30 min and 120 min, respectively. 5 µl of 75%
glycerol was added to each sample, and the samples
were loaded on an 8%
nondenaturing polyacrylamide gel. Electrophoresis
was at 300 volts in TBE buffer until bromophenol
blue moved ~6 cm from the top of the gel. The gel
was dried and autoradiographed.
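
The quantities in the exchange reaction can be checked with simple unit arithmetic: a 12 µM duplex stock contains 12 picomoles per microliter, so 1 µl of labeled duplex added to about 2 pmoles of complex is roughly a six-fold molar excess. The six-fold figure is derived here from the stated amounts and is not quoted in the text; the helper name is illustrative.

```python
def picomoles(conc_um, volume_ul):
    """Amount in pmol; uM x ul is numerically equal to pmol."""
    return conc_um * volume_ul


labeled_duplex_pmol = picomoles(12, 1)  # 1 ul of 12 uM labeled duplex
complex_pmol = 2                        # approximate amount stated in the text

molar_excess = labeled_duplex_pmol / complex_pmol
print(f"{labeled_duplex_pmol} pmol duplex, {molar_excess:.0f}-fold excess over complex")
```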

The complex readily exchanged 32P-labeled
specific oligonucleotide duplex that contained the
FokI recognition site, as seen from the gel mobility
shift assays (Figure 7). It did not, however,
exchange the 32P-labeled non-specific oligonucleotide
duplex that did not contain the FokI recognition
site. These results indicate that all the
information necessary for sequence-specific
recognition of DNA is encoded within the 41 kDa
N-terminal fragment of FokI.

Example VII
Analysis of FokI by trypsin cleavage
in the absence of DNA substrate.

A time course of trypsin digestion of FokI
endonuclease in the absence of the DNA substrate is
shown in Figure 8. Initially, FokI was cleaved into a
58 kDa fragment and an 8 kDa fragment. The 58 kDa
fragment did not bind DNA substrates and was not
retained by the oligo dT-cellulose column. On
further digestion, the 58 kDa fragment degraded into
several intermediate tryptic fragments. However,
the complete trypsin digestion yielded only 25 kDa
fragments (appearing as two overlapping bands).

Each of these species (58 kDa, 25 kDa and
8 kDa) was purified by reversed-phase HPLC and
their amino terminal amino acid sequences were determined
(Table I). Comparison of the N-terminal sequences
to the predicted FokI sequence revealed the 8
kDa fragment to be N-terminal and the 58 kDa
fragment to be C-terminal. This further supports
the conclusion that the N-terminus of FokI is
responsible for the recognition domain. Sequencing
the N-terminus of the 25 kDa fragments revealed the
presence of two different components. A time course
of trypsin digestion of FokI endonuclease in the
presence of a non-specific DNA substrate yielded a
profile similar to the one obtained when trypsin
digestion of FokI is carried out in the absence of any
DNA substrate.

Example VIII
Cleavage specificity of the 25 kDa
C-terminal tryptic fragment of FokI

The 25 kDa C-terminal tryptic fragment of
FokI cleaved pTZ19R to small products, indicating
non-specific cleavage. The degradation products
were dephosphorylated by calf intestinal phosphatase
and 32P-labeled with polynucleotide kinase and
[γ-32P]ATP. The excess label was removed using a
Sephadex G-25 (Superfine) column. The labeled
products were then digested with 1 unit of
pancreatic DNase I (Boehringer-Mannheim) in buffer
containing 50 mM Tris.HCl (pH 7.6), 10 mM MgCl2 at 37°C
for 1 hr. Then, 0.02 units of snake venom
phosphodiesterase was added to the reaction mixture
and digestion was continued at 37°C for 1 hr.

Example IX
Functional domains in FokI
restriction endonuclease.

Analysis of functional domains of FokI (in
the presence and absence of substrates) using
trypsin is summarized in Figure 9. Binding of DNA
substrate by FokI was accompanied by alteration in
the structure of the enzyme. This study supports
the presence of two separate protein domains within
this enzyme: one for sequence-specific recognition
and the other for endonuclease activity. The
results indicate that the recognition domain is at
the N-terminus of the FokI endonuclease, while the
cleavage domain is probably in the C-terminal third
of the molecule.
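
The fragment sizes reported in Examples IV to VII can be cross-checked by simple bookkeeping: complementary fragments from the same digest should add up to the same parent mass. The sketch below does this with the apparent sizes quoted in the text. These are gel estimates, so the sums are approximate, and the intact mass of about 66 kDa is inferred here by addition rather than stated in this section. The dictionary keys and function name are illustrative.

```python
# Apparent fragment sizes in kDa, as quoted in the Examples.
DIGESTS = {
    "FokI with specific DNA, early digest": (41, 25),
    "41 kDa fragment, further digestion": (30, 11),
    "FokI without DNA, early digest": (58, 8),
}


def parent_mass(fragments):
    """Approximate parent mass (kDa) as the sum of complementary fragments."""
    return sum(fragments)


for label, fragments in DIGESTS.items():
    print(f"{label}: {' + '.join(map(str, fragments))} = {parent_mass(fragments)} kDa")
```

Both early digests account for the same total, and the 30 kDa and 11 kDa species together account for the 41 kDa fragment, which is consistent with the domain assignments summarized above.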

Examples Relating to Construction
of Insertion Mutants (X-XIV)

The complete nucleotide sequence of the
FokI RM system has been published by various
laboratories (Looney et al., Gene 80: 193-208, 1989
and Kita et al., J. Biol. Chem. 264: 5751-56, 1989).
Experimental protocols for PCR are described, for
example, in Skoglund et al., Gene 88:1-5, 1990 and
in Bassing et al., Gene 113:83-88, 1992. The
procedures for cell growth and purification of the
mutant enzymes are similar to the ones used for the
wild-type FokI (Li et al., Proc. Nat'l. Acad. Sci.
USA 89:4275-79, 1992). Additional steps, which
include Sephadex G-75 gel filtration and
Heparin-Sepharose CL-6B column chromatography, were necessary
to purify the mutant enzymes to homogeneity.

Example X

Mutagenesis of SpeI Site at Nucleotide
162 within the fokIR Gene
The two-step PCR technique used to
mutagenize one of the SpeI sites within the fokIR
gene is described in Landt et al., Gene 96:125-28,
1990. The three synthetic primers for this protocol
include: 1) the mutagenic primer
(5'-TCATAATAGCAACTAATTCTTTTTGGATCTT-3') (see SEQ ID NO:24)
containing one base mismatch within the SpeI site;
2) the other two primers, each of which is flanked by
a restriction site, ClaI (5'-CCATCGATATAGCCTTTTTTATT-3')
(see SEQ ID NO:25) and XbaI
(5'-GCTCTAGAGGATCCGGAGGT-3') (see SEQ ID NO:26),
respectively. An intermediate fragment was
amplified using the XbaI primer and the mutagenic
primer during the first step. The ClaI primer was
then added to the intermediate for the second step
PCR. The final 0.3 kb PCR product was digested with
XbaI/ClaI to generate cohesive ends and gel-
purified. The expression vector (pRRSfokIR) was
cleaved with XbaI/ClaI. The large 4.2 kb fragment
was then gel-purified and ligated to the PCR
fragment. The recombinant DNA was transfected into
competent E. coli RR1[pACYCfokIM] cells. After
tetracycline and ampicillin antibiotic selection,
several clones were picked, and their plasmid DNA
was examined by restriction analysis. The SpeI site
mutation was confirmed by sequencing the plasmid DNA
using Sanger's sequencing method (Sanger et al.,
Proc. Natl. Acad. Sci. USA 74:5463-67, 1977).
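
The single-base mismatch can be checked against the SpeI recognition
sequence (5'-ACTAGT-3'). The following Python sketch is illustrative
only: it scans the mutagenic primer, as listed in SEQ ID NO:24, for
windows that differ from the SpeI site by at most one base. The helper
name near_matches and the constants are hypothetical and are not part
of the described protocol.

```python
def near_matches(seq, site, max_mismatches=1):
    """Return (offset, window, mismatches) for every window of seq that
    differs from site at no more than max_mismatches positions."""
    hits = []
    for i in range(len(seq) - len(site) + 1):
        window = seq[i:i + len(site)]
        mismatches = sum(a != b for a, b in zip(window, site))
        if mismatches <= max_mismatches:
            hits.append((i, window, mismatches))
    return hits


SPEI_SITE = "ACTAGT"                            # SpeI recognition sequence
MUTAGENIC_PRIMER = "TAGCAACTAATTCTTTTTGGATCTT"  # SEQ ID NO:24

# Report windows in the primer that are one base away from a SpeI site.
print(near_matches(MUTAGENIC_PRIMER, SPEI_SITE))
```

Because ACTAGT is palindromic, the same check applies whichever strand
the primer anneals to.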

Example XI

Construction of Four (or Seven) Codon
Insertion Mutants
The PCR-generated DNA containing a four
(or seven) codon insertion was digested with
SpeI/XmaI and gel-purified. The plasmid pRRSfokIR
from Example X was cleaved with SpeI/XmaI, and the
large 3.9 kb fragment was gel-purified and ligated
to the PCR product. The recombinant DNA was
transfected into competent RR1[pACYCfokIM] cells,
and the desired clones were identified as described in
Example X. The plasmids from these clones were
isolated and sequenced to confirm the presence of
the four (or seven) codon insertion within the fokIR
gene.

In particular, the construction of the
mutants was performed as follows: (1) There are
two SpeI sites at nucleotides 162 and 1152,
respectively, within the fokIR gene sequence. The
site at 1152 is located near the trypsin cleavage
site of FokI that separates the recognition and
cleavage domains. In order to insert the four (or
seven) codons around this region, the other SpeI
site at 162 was mutagenized using a two-step PCR
technique (Landt et al., Gene 96:125-28, 1990).
Introduction of this SpeI site mutation in the fokIR
gene does not affect the expression levels of the
overproducer clones. (2) The insertion of four (or
seven) codons was achieved using the PCR technique.
The mutagenic primers used in the PCR amplification
are shown in Figure 11. Each primer has a 21 bp
sequence complementary to the fokIR gene. The 5'
ends of these primers are flanked by SpeI sites. The
codons for KSEL and KSELEEK repeats are incorporated
between the SpeI site and the 21 bp complement.
Degenerate codons were used in these repeats to
circumvent potential problems during PCR
amplification. The other primer is complementary to
the 3' end of the fokIR gene and is flanked by an
XmaI site. The PCR-generated 0.6 kb fragments
containing the four (or seven) codon inserts were
digested with SpeI/XmaI and gel-purified. These
fragments were substituted into the high expression
vector pRRSfokIR to generate the mutants. Several
clones of each mutant were identified and their DNA
sequence was confirmed by Sanger's dideoxy chain
termination method (Sanger et al., Proc. Natl. Acad.
Sci. USA 74:5463-67, 1977).
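
The use of degenerate codons can be illustrated with the two primer
sequences given in the sequence listing as SEQ ID NO:35 and SEQ ID
NO:37. The Python sketch below is a minimal illustration, not the
procedure used in the Examples: it translates each primer with the
standard genetic code and lists the different codons that encode the
same residue. It assumes that these two sequences are the mutagenic
primers of Figure 11, and the reading-frame offset of 3 is inferred
from the peptides in SEQ ID NO:34 and SEQ ID NO:36; the helper names
are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codons(dna, frame=0):
    """Split dna into complete codons starting at offset frame."""
    return [dna[i:i + 3] for i in range(frame, len(dna) - 2, 3)]


def translate(dna, frame=0):
    """Translate dna into one-letter amino acid codes."""
    return "".join(CODON_TABLE[c] for c in codons(dna, frame))


def codons_for(dna, frame, residue):
    """Return the set of distinct codons in dna that encode residue."""
    return {c for c in codons(dna, frame) if CODON_TABLE[c] == residue}


SEQ_ID_35 = "GGACTAGTCAAATCTGAACTTAAAAGTGAACTGGAGGAGAAG"           # four-codon insert
SEQ_ID_37 = "GGACTAGTCAAATCTGAACTTGAGGAGAAGAAAAGTGAACTGGAGGAGAAG"  # seven-codon insert

FRAME = 3  # assumed offset: translation starts at the CTA inside the SpeI site

for name, primer in (("SEQ ID NO:35", SEQ_ID_35), ("SEQ ID NO:37", SEQ_ID_37)):
    print(name, translate(primer, FRAME))
    print("  Ser codons:", sorted(codons_for(primer, FRAME, "S")))
    print("  Leu codons:", sorted(codons_for(primer, FRAME, "L")))
```

In both primers the repeated KSEL motif is encoded by different serine
and leucine codons in each copy, which keeps the two copies from being
identical at the nucleotide level.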

Upon induction with 1 mM isopropyl β-D-
thiogalactoside (IPTG), the expression of mutant
enzymes in these clones became most prominent at 3
hrs as determined by SDS/PAGE. This was further
supported by the assays for the enzyme activity.
The levels of expression of the mutant enzymes in
these clones were much lower compared to the wild-
type FokI. IPTG induction for longer times resulted
in lower enzyme levels, indicating that the mutant
enzymes were actively degraded within these clones.
This suggests that the insertion of four (or seven)
codons between the recognition and cleavage domains
of FokI destabilizes the protein conformation, making
the mutant enzymes more susceptible to degradation
within the cells. SDS/PAGE profiles of the mutant
enzymes are shown in Figure 12.

Example XII
Preparation of DNA Substrates with
a Single FokI Site
Two substrates, each containing a single
FokI recognition site, were prepared by PCR using
pTZ19R as the template. Oligonucleotide primers,
5'-CGCAGTGTTATCACTCAT-3' and 5'-CTTGGTTGAGTACTCACC-
3' (see SEQ ID NO:27 and SEQ ID NO:28,
respectively), were used to synthesize the 100 bp
fragment. Primers, 5'-ACCGAGCTCGAATTCACT-3' and 5'-
GATTTCGGCCTATTGGTT-3' (see SEQ ID NO:29 and SEQ ID
NO:30, respectively), were used to prepare the 256
bp fragment. Individual strands within these
substrates were radiolabeled by using the
corresponding 32P-labeled phosphorylated primers
during PCR. The products were purified from low-
melting agarose gel, ethanol precipitated and
resuspended in TE buffer.

Example XIII
Analysis of the Sequence Specificity
of the Mutant Enzymes
The agarose gel electrophoretic profiles of
the cleavage products of pTZ19R DNA by FokI and the
mutants are shown in Figure 13A. They are very
similar, suggesting that insertion of four (or seven)
codons in the linker region between the recognition
and cleavage domains does not alter the DNA sequence
specificity of the enzyme. This was further confirmed
by using 32P-labeled DNA substrates (100 bp and 256 bp),
each containing a single FokI site. Substrates
containing individual strands labeled with 32P were
prepared as described in Example XII. FokI cleaves
the 256 bp substrate into two fragments, 180 bp and
72 bp, respectively (Figure 13B). The length of the
fragments was calculated from the 32P-labeled 5' end
of each strand. The autoradiograph of the agarose
gel is shown in Figure 13C. Depending on which
strand carries the 32P-label in the substrate, either
the 72 bp fragment or the 180 bp fragment appears as
a band in the autoradiograph. The mutant enzymes reveal
identical agarose gel profiles and autoradiographs.
Therefore, insertion of four (or seven) codons
between the recognition and cleavage domains does
not alter the DNA recognition mechanism of FokI
endonuclease.

Example XIV
Analysis of the Cleavage Distances from the
Recognition Site by the Mutant Enzymes

To determine the distance of cleavage by
the mutant enzymes, their cleavage products of the
32P-labeled substrates were analyzed by PAGE (Figure
14). The digests were analyzed alongside the
sequencing reactions of pTZ19R performed with the
same primers used in PCR to synthesize these
substrates. The cleavage patterns of the 100 bp
fragment by FokI and the mutants are shown in Figure
14A. The cut sites are shifted from the recognition
site on both strands of the substrates in the case
of the mutants, as compared to the wild-type enzyme.
The small observable shifts between the sequencing
gel and the cleavage products are due to the
unphosphorylated primers that were used in the
sequencing reactions.

On the 5'-GGATG-3' strand, both mutants
cut the DNA 10 nucleotides away from the site while
on the 5'-CATCC-3' strand they cut 14 nucleotides
away from the recognition site. These appear to be
the major cut sites for both the mutants. A small
amount of cleavage similar to the wild-type enzyme
is also observed.

The cleavage pattern of the 256 bp
fragment is shown in Figure 14B. The pattern of
cleavage is similar to the 100 bp fragment. Some
cleavage is seen 15 nucleotides away from the
recognition site on the 5'-CATCC-3' strand in the
case of the mutants. The multiple cut sites for the
mutant enzymes could be attributed to the presence
of different conformations in these proteins, or
to the increased flexibility of the spacer
region between the two domains. Depending on the
DNA substrate, some variation in the intensity of
cleavage at these sites was observed. This may be
due to the nucleotide sequence around these cut
sites. Naturally occurring Type IIS enzymes with
multiple cut sites have been reported (Szybalski et
al., Gene 100:13-26, 1991).
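
The major cut positions reported above for the mutants (10 nucleotides
on the 5'-GGATG-3' strand and 14 nucleotides on the 5'-CATCC-3'
strand) can be turned into coordinates on a substrate. The Python
sketch below is a hedged illustration: it assumes that both distances
are counted downstream of the last base of GGATG and expressed in
top-strand coordinates, and it uses the 30-nucleotide sequence of SEQ
ID NO:14 only as a convenient example containing a single site. The
function name cut_positions is hypothetical.

```python
def cut_positions(top_strand, site="GGATG", top_offset=10, bottom_offset=14):
    """Return (top_cut, bottom_cut) as indices into top_strand.

    Each index is the number of top-strand bases lying 5' of the cut.
    Offsets are counted from the last base of the recognition site
    (an assumption about how the distances are measured).
    """
    start = top_strand.find(site)
    if start < 0:
        raise ValueError("recognition site not found")
    site_end = start + len(site)
    return site_end + top_offset, site_end + bottom_offset


# SEQ ID NO:14, used here purely as an example substrate.
SUBSTRATE = "CCTCTGGATGCTCTC" + "A" * 15

top_cut, bottom_cut = cut_positions(SUBSTRATE)
print("top-strand cut after base", top_cut)
print("bottom-strand cut after base", bottom_cut)
print("5' overhang length:", bottom_cut - top_cut)
```

Under these assumptions the stagger between the two cuts stays at four
nucleotides; only the distance from the recognition site changes.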

Examples Relating to Construction of the
Hybrid Enzyme Ubx-FN (XV-XVII)
As noted above, the complete nucleotide
sequence of the FokI restriction-modification system
has been published by other laboratories (Kita et
al., J. Biol. Chem. 264:5751-56 (1989); Looney et
al., Gene 80:193-208 (1989)). Experimental
protocols for PCR are described elsewhere (Skoglund
et al., Gene 88:1-5 (1990)). The procedures for
cell growth and purification of proteins using His-
bind™ resin are as outlined in the Novagen pET system
manual. Additional steps, which include
phosphocellulose and DEAE column chromatography,
were necessary to purify the hybrid protein, Ubx-FN,
to near homogeneity. The protocol for SDS/PAGE is
as described by Laemmli (Nature 227:680-685 (1970)).

Preparation of pUC13 derived substrates:

pUC13 derived DNA substrates were prepared
by blunt-end ligation of SmaI-cleaved pUC13 plasmid
with a ten-fold excess of a 30 bp insert containing a
known Ubx site, 5'-TTAATGGTT-3'. Several clones
were picked and their plasmid DNA was analyzed for
the presence of 30 bp inserts. Clones containing
pUC13(1), pUC13(2) or pUC13(3), with 1, 2 or 3
inserts respectively, were identified. Their DNA
sequences were confirmed by Sanger's dideoxy
sequencing method (Proc. Natl. Acad. Sci. USA
74:5463-67 (1977)).

Preparation of DNA substrates with
a single Ubx site:
The polylinker region of pUC13(1), which
has a single 30 bp insert, was excised using
EcoRI/HindIII and gel-purified. Individual strands
of this substrate were radiolabeled by using 32P-dATP
or 32P-dCTP and filling in the sticky ends of the
fragment with Klenow enzyme. The products were
purified from low-melting agarose gel, ethanol-
precipitated, and resuspended in the buffer (10 mM
Tris.HCl/1 mM EDTA, pH 8.0).

Example XV
Construction of the Clone Producing
the Hybrid Enzyme, Ubx-FN, Using PCR
The homeo domain of Ubx, a 61 amino acid
protein sequence encoded by the homeobox of Ubx, is a
sequence-specific DNA-binding domain with a
structure related to helix-turn-helix motifs found
in bacterial DNA-binding proteins (Hayashi et al.,
Cell 63:883-894 (1992); Wolberger et al., Cell
67:517-28 (1991)). The Ubx homeo domain recognizes
the 9 bp consensus DNA sites, 5'-TTAAT(G/T)(G/A)CC-3'
(Ekker et al., The EMBO Journal 10:1179-86
(1991); Ekker et al., The EMBO Journal 11:4059-4072
(1992)). The present inventors used the PCR
technique to link the Ubx homeo domain to the
cleavage domain (FN) of FokI and to express the Ubx-
FN enzyme in E. coli. A schematic representation of
the engineered Ubx-FN hybrid protein is shown in Fig.
16. The oligonucleotide primers used to construct
the hybrid gene are shown in Fig. 17A.
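
The relationship between the 9 bp consensus and individual Ubx sites
can be made concrete with a short comparison. The Python sketch below
is illustrative only: it encodes the consensus
5'-TTAAT(G/T)(G/A)CC-3' as a list of allowed bases per position and
counts, for each of the three sites listed in Table 3, how many
positions fall outside the consensus. The names CONSENSUS,
TABLE_3_SITES and consensus_mismatches are hypothetical, and a
mismatch count is not a measure of binding affinity.

```python
# Consensus 5'-TTAAT(G/T)(G/A)CC-3', one entry of allowed bases per position.
CONSENSUS = ["T", "T", "A", "A", "T", "GT", "GA", "C", "C"]

# The two putative pUC13 sites and the inserted site, as listed in Table 3.
TABLE_3_SITES = ["TTAATGTCA", "TTAATGAAT", "TTAATGGTT"]


def consensus_mismatches(site, consensus=CONSENSUS):
    """Count positions of site whose base is not allowed by the consensus."""
    if len(site) != len(consensus):
        raise ValueError("site and consensus must have the same length")
    return sum(base not in allowed for base, allowed in zip(site, consensus))


for site in TABLE_3_SITES:
    # Print the site, whether it carries the TAAT core, and its mismatch count.
    print(site, "TAAT" in site, consensus_mismatches(site))
```

All three sites share the first five positions of the consensus,
including the TAAT core, and differ from it only toward the 3' end.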

Construction of the clone expressing the
hybrid protein was done as follows: First, the PCR-
generated Ubx homeo box was digested with PstI/SpeI
and gel-purified. This fragment was then
substituted into the vector pRRSfokIR to replace the
DNA segment coding for the FokI DNA-binding domain
and, hence, form the Ubx-FN hybrid gene (Fig. 17B).
After transfection of competent RR1 cells with the
ligation mix, several clones were identified by
restriction analysis and their DNA sequences were
confirmed by the dideoxy chain-termination method of
Sanger et al. (Proc. Natl. Acad. Sci. USA 74:5463-67
(1977)). Second, the hybrid gene was amplified
using the Ubx-FN primers. The PCR-generated DNA was
digested with NdeI/BamHI and gel-purified. This
fragment was then ligated into the NdeI/BamHI-
cleaved pET-15b vector. This construct tags the
hybrid protein with 6 consecutive histidine residues
at the N-terminus. These serve as the affinity tag
for purification of this protein by metal chelation
chromatography using Novagen's His-bind™ resin.
This His tag can be subsequently removed by
thrombin. Competent BL21(DE3) cells were
transformed with the ligation mix and several clones
containing the recombinant DNA (Fig. 17B) were
identified. These colonies were sick and grew
poorly in culture with a doubling time of about 45
minutes. After induction with 1 mM isopropyl-β-D-
thiogalactoside (IPTG), the hybrid enzyme was
purified to homogeneity using His-bind™ resin,
phosphocellulose and gel-chromatography. The
SDS/PAGE profile of the purified hybrid enzyme is
shown in Fig. 18. The identity of the hybrid
protein was further confirmed by probing the Western
blot with rabbit antisera raised against FokI
endonuclease (data not shown).

Example XVI
Analysis of the DNA Sequence Preference
of the Ubx-FN Hybrid Enzyme
The linearized pUC13 derived substrates
used to characterize Ubx-FN are shown in Fig. 19.
The derivatives were constructed by inserting a 30
bp DNA fragment containing a known Ubx recognition
sequence 5'-TTAATGGTT-3' at the SmaI site of pUC13.
Cleavage at the inserted Ubx site should yield ~1.8
kb and ~0.95 kb fragments as products. The agarose
gel electrophoretic profile of the partial digests
of the substrates by Ubx-FN is shown in Fig. 19. In
these reactions, the DNA was in large molar
excess compared to the protein. The reaction
conditions were optimized to give a single double-
stranded cleavage per substrate molecule. The
linearized pUC13 DNA is cleaved into four fragments.
The appearance of four distinct bands in the agarose
gel electrophoretic profile indicates that Ubx-FN
binds DNA in a sequence-specific manner, and that
there are two binding sites within the linearized
pUC13 for the hybrid protein. This is further
supported by the fact that the linearized pUC13 DNA
substrate containing a single Ubx site is cleaved
into six fragments. The two additional fragments
(~1.8 kb and ~0.95 kb, respectively) could be
explained as resulting from the binding of the
hybrid protein at the newly inserted Ubx site of
pUC13 and cleavage near this site. As expected, the
intensity of the bands increases with the number of
30 bp inserts in pUC13. The two putative Ubx
binding sites in pUC13 and the inserted Ubx site are
shown in Table 3 below. All these sites have 5'-
TAAT-3' as their core sequence, and these preferred
sites are consistent with those reported for the Ubx
homeo domain. The affinity of the Ubx homeo domain for
these sites is modulated by the nucleotide bases
surrounding the core site. It appears that the
hybrid protein does turn over, since complete
digestion is observed over longer time periods or on
increasing the protein concentration (data not
shown). The cleavage is more specific at higher
temperatures.
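
The fragment counts described above follow from simple arithmetic: if
each linear substrate molecule is cut once, every binding site
contributes two fragments, so two sites give four bands and three
sites give six. The Python sketch below illustrates this under stated
assumptions. The substrate length (about 2.75 kb, the sum of the ~1.8
kb and ~0.95 kb products) and the position of the inserted site follow
from the text, whereas the positions of the two native sites (400 and
1500) are hypothetical placeholders chosen only for the example.

```python
def single_cut_fragments(length, sites):
    """Fragment sizes expected when each linear molecule of the given
    length is cut exactly once, at any one of the listed site positions."""
    fragments = []
    for position in sites:
        if not 0 < position < length:
            raise ValueError("site position must lie inside the molecule")
        fragments.extend((position, length - position))
    return fragments


LENGTH = 2750                # approximate, from ~1.8 kb + ~0.95 kb
NATIVE_SITES = [400, 1500]   # hypothetical positions of the two pUC13 sites
INSERTED_SITE = 950          # gives the ~0.95 kb and ~1.8 kb products

print(single_cut_fragments(LENGTH, NATIVE_SITES))
print(single_cut_fragments(LENGTH, NATIVE_SITES + [INSERTED_SITE]))
```

The first call returns four sizes and the second six, the extra pair
being the 950 and 1800 bp fragments produced at the inserted site.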

Example XVII
Analysis of the Cleavage Distance from the
Recognition Site by the Hybrid Enzyme

To determine the distance of cleavage from
the recognition site by Ubx-FN, the cleavage products
of the 32P-labeled DNA substrates containing a single
Ubx site were analyzed by PAGE (Fig. 20). The
digestion products were analyzed alongside the
Maxam-Gilbert (G + A) sequencing reactions of the
substrates. As expected, the cut sites are shifted
away from the recognition site. On the 5'-TAAT-3'
strand, Ubx-FN cuts the DNA 3 nucleotides away from
the recognition site while on the 5'-ATTA-3' strand
it cuts 8, 9 or 10 nucleotides away from the
recognition site. Analysis of the cut sites of Ubx-FN
based on the cleavage of the DNA substrate
containing a single Ubx site is summarized in Fig.
20. The cleavage occurs 5' to the TAAT sequence and
is consistent with the way the Ubx-FN hybrid protein
was engineered (Fig. 16).

TABLE 1

Amino-terminal sequences of FokI
fragments from trypsin digestion

Fragment   Amino-terminal sequence          DNA substrate   SEQ ID NO

8 kDa      VSKIRTFG*VQNPGKFENLKRVVQVFDRS    -               16
58 kDa     SEAPCDAIIQ                                       17
25 kDa     QLVKSELEEK                       +               18
41 kDa     VSKIRTFGWV                                       19
30 kDa     VSKIRTFGWV                                       19
11 kDa     FTRVPKRVY                                        20

TABLE 3

Ubx-binding Sites in pUC13

Sequence            Remarks

5'-TTAATGTCA-3'     putative Ubx sites present in pUC13
5'-TTAATGAAT-3'

5'-TTAATGGTT-3'     Ubx site inserted at the SmaI site of pUC13

SEQUENCE LISTING

(1) GENERAL INFORMATION: 

(i) APPLICANT: Chandrasegaran, Srinivasan 



(ii) TITLE OF INVENTION: Functional Domains in FokI
Restriction Endonuclease

(iii) NUMBER OF SEQUENCES : 48 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Cushman, Darby & Cushman 

(B) STREET: 1100 New York Ave., N.W. 

(C) CITY: Washington 

(D) STATE: D.C. 

(E) COUNTRY: USA 

(F) ZIP: 20005-3918 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0,
Version #1.25

(vi) CURRENT APPLICATION DATA:

(A) APPLICATION NUMBER: US 08/126,564 

(B) FILING DATE: 27-SEPTEMBER-93 

(C) CLASSIFICATION: 

(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: Kokulis, Paul N. 

(B) REGISTRATION NUMBER: 16,773 

(C) REFERENCE/DOCKET NUMBER: PNK/4130/82506/CLB

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: 202-861-3503 

(B) TELEFAX: 202-822-0944 

(C) TELEX: 6714627 CUSH 



(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:
GGATG



(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 
CCTAC 



(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 35 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 18..35

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

CCATGGAGGT TTAAAAT ATG AGA TTT ATT GGC AGC
                   Met Arg Phe Ile Gly Ser
                   1               5



(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:
Met Arg Phe Ile Gly Ser
1               5



(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 
ATACCATGGG AATTAAATGA CACAGCATCA 30 



(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 42 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 22..42

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

TAGGATCCGG AGGTTTAAAA T ATG GTT TCT AAA ATA AGA ACT 42
                        Met Val Ser Lys Ile Arg Thr
                        1               5



(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

Met Val Ser Lys Ile Arg Thr 7
1               5

(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 35 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:
TAGGATCCTC ATTAAAAGTT TATCTCGCCG TTATT 



(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

Asn Asn Gly Glu Ile Asn Phe
1               5



(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 
CCTCTGGATG CTCTC 



(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:
GAGAGCATCC AGAGG



(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 
TAATTGATTC TTAA 14 



(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:
ATTAAGAATC AATT 14 

(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:14:
CCTCTGGATG CTCTCAAAAA AAAAAAAAAA 30 



(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:
GAGAGCATCC AGAGGAAAAA AAAAAAAAAA



(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

Val Ser Lys Ile Arg Thr Phe Gly Xaa Val Gln Asn Pro Gly Lys
1               5                   10                  15

Phe Glu Asn Leu Lys Arg Val Val Gln Val Phe Asp Arg Ser
                20                  25



(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

Ser Glu Ala Pro Cys Asp Ala Ile Ile Gln 10
1               5                   10

(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

Gln Leu Val Lys Ser Glu Leu Glu Glu Lys 10
1               5                   10



(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

Val Ser Lys Ile Arg Thr Phe Gly Trp Val 10
1               5                   10

(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:

Phe Thr Arg Val Pro Lys Arg Val Tyr 9 
1 5 

(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:

Glu Glu Lys 3
1

(2) INFORMATION FOR SEQ ID NO: 22:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:
Lys Ser Glu Leu 



(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:

Lys Ser Glu Leu Glu Glu Lys 
1 5 



(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 
TAGCAACTAA TTCTTTTTGG ATCTT 



(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

61 



WO 95/09233 



(Xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 
CCATCGATAT AGCCTTTTTT ATT 



(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:26:

GCTCTAGAGG ATCCGGAGGT



(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: 
CGCAGTGTTA TCACTCAT 



(2) INFORMATION FOR SEQ ID NO:28: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28: 
CTTGGTTGAG TACTCACC 



(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:29:
ACCGAGCTCG AATTCACT 18

(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30:

GATTTCGGCC TATTGGTT 18

(2) INFORMATION FOR SEQ ID NO: 31: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 579 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: 

Met Val Ser Lys Ile Arg Thr Phe Gly Trp Val Gln Asn Pro Gly
1               5                   10                  15

Lys Phe Glu Asn Leu Lys Arg Val Val Gln Val Phe Asp Arg Asn
                20                  25                  30

Ser Lys Val His Asn Glu Val Lys Asn Ile Lys Ile Pro Thr Leu
                35                  40                  45

Val Lys Glu Ser Lys Ile Gln Lys Glu Leu Val Ala Ile Met Asn
                50                  55                  60

Gln His Asp Leu Ile Tyr Thr Tyr Lys Glu Leu Val Gly Thr Gly
                65                  70                  75

Thr Ser Ile Arg Ser Glu Ala Pro Cys Asp Ala Ile Ile Gln Ala
                80                  85                  90

Thr Ile Ala Asp Gln Gly Asn Lys Lys Gly Tyr Ile Asp Asn Trp
                95                  100                 105

Ser Ser Asp Gly Phe Leu Arg Trp Ala His Ala Leu Gly Phe Ile
                110                 115                 120

Glu Tyr Ile Asn Lys Ser Asp Ser Phe Val Ile Thr Asp Val Gly
                125                 130                 135

Leu Ala Tyr Ser Lys Ser Ala Asp Gly Ser Ala Ile Glu Lys Glu
                140                 145                 150

Ile Leu Ile Glu Ala Ile Ser Ser Tyr Pro Pro Ala Ile Arg Ile
                155                 160                 165

Leu Thr Leu Leu Glu Asp Gly Gln His Leu Thr Lys Phe Asp Leu
                170                 175                 180

Gly Lys Asn Leu Gly Phe Ser Gly Glu Ser Gly Phe Thr Ser Leu
                185                 190                 195

Pro Glu Gly Ile Leu Leu Asp Thr Leu Ala Asn Ala Met Pro Lys
                200                 205                 210

Asp Lys Gly Glu Ile Arg Asn Asn Trp Glu Gly Ser Ser Asp Lys
                215                 220                 225

Tyr Ala Arg Met Ile Gly Gly Trp Leu Asp Lys Leu Gly Leu Val
                230                 235                 240

Lys Gln Gly Lys Lys Glu Phe Ile Ile Pro Thr Leu Gly Lys Pro
                245                 250                 255

Asp Asn Lys Glu Phe Ile Ser His Ala Phe Lys Ile Thr Gly Glu
                260                 265                 270

Gly Leu Lys Val Leu Arg Arg Ala Lys Gly Ser Thr Lys Phe Thr
                275                 280                 285

Arg Val Pro Lys Arg Val Tyr Trp Glu Met Leu Ala Thr Asn Leu
                290                 295                 300

Thr Asp Lys Glu Tyr Val Arg Thr Arg Arg Ala Leu Ile Leu Glu
                305                 310                 315

Ile Leu Ile Lys Ala Gly Ser Leu Lys Ile Glu Gln Ile Gln Asp
                320                 325                 330

Asn Leu Lys Lys Leu Gly Phe Asp Glu Val Ile Glu Thr Ile Glu
                335                 340                 345

Asn Asp Ile Lys Gly Leu Ile Asn Thr Gly Ile Phe Ile Glu Ile
                350                 355                 360

Lys Gly Arg Phe Tyr Gln Leu Lys Asp His Ile Leu Gln Phe Val
                365                 370                 375

Ile Pro Asn Arg Gly Val Thr Lys Gln Leu Val Lys Ser Glu Leu
                380                 385                 390

Glu Glu Lys Lys Ser Glu Leu Arg His Lys Leu Lys Tyr Val Pro
                395                 400                 405

His Glu Tyr Ile Glu Leu Ile Glu Ile Ala Arg Asn Ser Thr Gln
                410                 415                 420

Asp Arg Ile Leu Glu Met Lys Val Met Glu Phe Phe Met Lys Val
                425                 430                 435

Tyr Gly Tyr Arg Gly Lys His Leu Gly Gly Ser Arg Lys Pro Asp
                440                 445                 450

Gly Ala Ile Tyr Thr Val Gly Ser Pro Ile Asp Tyr Gly Val Ile
                455                 460                 465

Val Asp Thr Lys Ala Tyr Ser Gly Gly Tyr Asn Leu Pro Ile Gly
                470                 475                 480

Gln Ala Asp Glu Met Gln Arg Tyr Val Glu Glu Asn Gln Thr Arg
                485                 490                 495

Asn Lys His Ile Asn Pro Asn Glu Trp Trp Lys Val Tyr Pro Ser
                500                 505                 510

Ser Val Thr Glu Phe Lys Phe Leu Phe Val Ser Gly His Phe Lys
                515                 520                 525

Gly Asn Tyr Lys Ala Gln Leu Thr Arg Leu Asn His Ile Thr Asn
                530                 535                 540

Cys Asn Gly Ala Val Leu Ser Val Glu Glu Leu Leu Ile Gly Gly
                545                 550                 555

Glu Met Ile Lys Ala Gly Thr Leu Thr Leu Glu Glu Val Arg Arg
                560                 565                 570

Lys Phe Asn Asn Gly Glu Ile Asn Phe
                575

(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32:

Lys Gln Leu Val Lys Ser Glu Leu Glu Glu Lys 11
1               5                   10



(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: 
AAGCAACTAG TCAAAAGTGA ACTGGAGGAG AAG 33 



(2) INFORMATION FOR SEQ ID NO: 34: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34:

Leu Val Lys Ser Glu Leu Lys Ser Glu Leu Glu Glu Lys
1               5                   10



(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 42 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35:
GGACTAGTCA AATCTGAACT TAAAAGTGAA CTGGAGGAGA AG 42



(2) INFORMATION FOR SEQ ID NO: 36: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 16 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36: 

Leu Val Lys Ser Glu Leu Glu Glu Lys Lys Ser Glu Leu Glu 
1 5 10 

Glu Lys 



(2) INFORMATION FOR SEQ ID NO: 37: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 51 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37:
GGACTAGTCA AATCTGAACT TGAGGAGAAG AAAAGTGAAC TGGAGGAGAA G 51 



(2) INFORMATION FOR SEQ ID NO: 38: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:
Asn Phe Xaa Xaa 



(2) INFORMATION FOR SEQ ID NO: 39:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39: 
TTGAAAATTA CTCCTAGGGG CCCCCCT 27 

(2) INFORMATION FOR SEQ ID NO: 40: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40: 
GGATGNNNNNNNNNNNNNNNNNN 23 



(2) INFORMATION FOR SEQ ID NO: 41: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 41 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41:

TACCTGCAGC GGAGGTTTAA AAT ATG CGA AGA CGC GGC CGA 41 

Met Arg Arg Arg Gly Arg 
1 5 



(2) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42: 

T TAC TTC GAC TTC TTC CTC TAG GTT GAT CAG AT 33 
Met Lys Leu Lys Lys Glu Ile Gln Leu Val
1 5 10 



(2) INFORMATION FOR SEQ ID NO: 43: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43: 

CCA CGG CAT ATG CGA AGA CGC GGC CGA 27 
Met Arg Arg Arg Gly Arg 
1 5 

(2) INFORMATION FOR SEQ ID NO: 44: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 35 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44: 

TTA TTG CCG CTC TAT TTG AAA ATT ACT CCTAGG AT 35 
Asn Asn Gly Glu Ile Asn Phe
1 5 



(2) INFORMATION FOR SEQ ID NO: 45: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45: 
AGAGGAGGTA ATGGG 



(2) INFORMATION FOR SEQ ID NO: 46: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 16 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46:
ATTAAGGGGG GAAGAG 



(2) INFORMATION FOR SEQ ID NO: 47: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47:
CTCTAGAGGA TCCCCGCGCT TAATGGTTTT TGC 



(2) INFORMATION FOR SEQ ID NO: 48: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48:
GAGATCTCCT AGGGGCGCGA ATTACCAAAA ACG

*****

All publications mentioned hereinabove are 
hereby incorporated by reference. 

While the foregoing invention has been
described in some detail for purposes of clarity and
understanding, it will be appreciated by one skilled
in the art that various changes in form and detail
can be made without departing from the true scope of
the invention.






WO 95/09233 PCT/US94/09143 
WHAT IS CLAIMED IS:

1. An isolated DNA segment encoding the
recognition domain of a Type IIS endonuclease which
contains the sequence-specific recognition activity
of said Type IIS endonuclease.

2. The DNA segment of claim 1 wherein
said Type IIS endonuclease is FokI restriction
endonuclease.

3. The DNA segment of claim 2 wherein the
encoded protein has a molecular weight of about 41
kilodaltons as determined by SDS-polyacrylamide gel
electrophoresis.

4. The DNA segment of claim 3 which
encodes amino acids 1-382 of the FokI restriction
endonuclease.

5. An isolated DNA segment encoding the
catalytic domain of a Type IIS endonuclease which
contains the cleavage activity of said Type IIS
endonuclease.

6. The DNA segment of claim 5 wherein
said Type IIS endonuclease is FokI restriction
endonuclease.

7. The DNA segment of claim 6 wherein the
encoded protein has a molecular weight of about 25
kilodaltons as determined by SDS-polyacrylamide gel
electrophoresis.

8. The DNA segment of claim 7 which
encodes amino acids 383-578 of the FokI restriction
endonuclease.

9. An isolated protein consisting
essentially of the N-terminus of the FokI
restriction endonuclease which protein has the
sequence-specific recognition activity of said
endonuclease.



10. The protein of claim 9 which is amino
acids 1-382 of the FokI restriction endonuclease.

11. An isolated protein consisting
essentially of the C-terminus of the FokI
restriction endonuclease which protein has the
cleavage activity of said endonuclease.

12. The protein of claim 11 which is
amino acids 383-578 of the FokI restriction
endonuclease.

13. A DNA construct comprising:

(i) a first DNA segment encoding the
catalytic domain of a Type IIS endonuclease which
contains the cleavage activity of said Type IIS
endonuclease;

(ii) a second DNA segment encoding a
sequence-specific recognition domain other than the
recognition domain of said Type IIS endonuclease;
and

(iii) a vector

wherein said first DNA segment and said
second DNA segment are operably linked to said
vector so that a single protein is produced.

14. The DNA construct according to claim
13 wherein said Type IIS endonuclease is FokI
restriction endonuclease.

15. The DNA construct according to claim
14 wherein said recognition domain is selected from
the group consisting of: zinc finger motifs, homeo
domain motifs, DNA binding domains of repressors,
POU domain motifs (eukaryotic transcription
regulators), DNA binding domains of oncogenes and
naturally occurring sequence-specific DNA binding
proteins that recognize >6 base pairs.

16. The DNA construct according to claim
15 wherein said recognition domain is the homeo
domain of Ubx.

17. A procaryotic cell comprising:

(i) a first DNA segment encoding the
catalytic domain of a Type IIS endonuclease which
contains the cleavage activity of said Type IIS
endonuclease;

(ii) a second DNA segment encoding a
sequence-specific recognition domain other than the
recognition domain of said Type IIS endonuclease;
and

(iii) a vector

wherein said first DNA segment and said
second DNA segment are operably linked to said
vector so that a single protein is produced.

18. The procaryotic cell of claim 17
wherein said first DNA segment encodes the catalytic
domain (F_N) of FokI, and said second DNA segment
encodes the homeo domain of Ubx.

19. A hybrid restriction enzyme
comprising the catalytic domain of a Type IIS
endonuclease which contains the cleavage activity of
said Type IIS endonuclease covalently linked to a
recognition domain of an enzyme other than said Type
IIS endonuclease.

20. The hybrid restriction enzyme of
claim 19 wherein said recognition domain, which
comprises part of said hybrid restriction enzyme, is
selected from the group consisting of: zinc finger
motifs, homeo domain motifs, POU domain motifs, DNA
binding domains of repressors, DNA binding domains
of oncogenes and naturally occurring sequence-specific
DNA binding proteins that recognize >6 base
pairs.

21. The hybrid restriction enzyme of
claim 20 wherein said recognition domain is the
homeo domain of Ubx.

22. The hybrid restriction enzyme of
claim 21 wherein said Type II endonuclease is FokI
restriction endonuclease and said hybrid enzyme is
Ubx-F_N.

23. A DNA construct comprising:

(i) a first DNA segment encoding the
catalytic domain of a Type IIS endonuclease which
contains the cleavage activity of said Type IIS
endonuclease;

(ii) a second DNA segment encoding a
sequence-specific recognition domain other than the
recognition domain of said Type IIS endonuclease;

(iii) a third DNA segment comprising one
or more codons, wherein said third DNA segment is
inserted between said first DNA segment and said
second DNA segment; and

(iv) a vector

wherein said first DNA segment, said
second DNA segment and said third DNA segment are
operably linked to said vector so that a single
protein is produced.

24. The DNA construct according to claim
23 wherein said Type IIS endonuclease is FokI
restriction endonuclease.

25. The DNA construct according to claim
24 wherein said third DNA segment consists
essentially of four codons.

26. The DNA construct according to claim
25 wherein said four codons of said third DNA
segment are inserted at nucleotide 1152 of the gene
encoding said endonuclease.

27. The DNA construct according to claim
24 wherein said third DNA segment consists
essentially of 7 codons.

28. The DNA construct according to claim
27 wherein said 7 codons of said third DNA segment
are inserted at nucleotide 1152 of the gene encoding
said endonuclease.

29. The DNA construct according to claim
24 wherein said recognition domain is selected from
the group consisting of: zinc finger motifs, homeo
domain motifs, POU domain motifs, DNA binding
domains of repressors, DNA binding domains of
oncogenes and naturally occurring sequence-specific
DNA binding proteins that recognize >6 base pairs.

30. A procaryotic cell comprising:

(i) a first DNA segment encoding the
catalytic domain of a Type IIS endonuclease which
contains the cleavage activity of said Type IIS
endonuclease;

(ii) a second DNA segment encoding a
sequence-specific recognition domain other than the
recognition domain of said Type IIS endonuclease;

(iii) a third DNA segment comprising one
or more codons, wherein said third DNA segment is
inserted between said first DNA segment and said
second DNA segment; and

(iv) a vector

wherein said first DNA segment, said
second DNA segment, and said third DNA segment are
operably linked to said vector so that a single
protein is produced.

31. The procaryotic cell of claim 30
wherein said third DNA segment consists essentially
of four codons.

32. The procaryotic cell of claim 30
wherein said third DNA segment consists essentially
of seven codons.

33. An isolated hybrid Type IIS
endonuclease produced by the procaryotic cell of
claim 30.

34. An isolated DNA segment encoding the
N-terminus of a Type IIS endonuclease which contains
the sequence-specific recognition activity of said
Type II endonuclease, said Type II endonuclease
being FokI restriction endonuclease and having a
molecular weight of about 41 kilodaltons as measured
by SDS-polyacrylamide gel electrophoresis.

35. An isolated DNA segment encoding the
C-terminus of a Type IIS endonuclease which contains
the cleavage activity of said Type IIS endonuclease,
said Type II endonuclease being FokI restriction
endonuclease and having a molecular weight of about
25 kilodaltons as determined by SDS-polyacrylamide
gel electrophoresis.

36. An isolated protein consisting
essentially of the N-terminus of the FokI
restriction endonuclease which protein has the
sequence-specific recognition activity of said
endonuclease and which protein is amino acids 1-382
of said FokI restriction endonuclease.

37. An isolated protein consisting
essentially of the C-terminus of the FokI
restriction endonuclease which protein has the
nuclease activity of said endonuclease and which
protein is amino acids 383-578 of said FokI
restriction endonuclease.

FIG. 1
FIG. 5
FIG. 6A
FIG. 8
FIG. 9
FIG. 12
FIG. 13A
FIG. 13B
FIG. 13C
FIG. 14A

FIG. 15A

(A) wild-type FokI

FIG. 15B

(B) 4-codon insertion mutant

5'- GGATG NNNNNNNNNNNNNNNNNN -3'
3'- CCTAC NNNNNNNNNNNNNNNNNN -5'

FIG. 15C

(C) 7-codon insertion mutant

FIG. 19A

[Restriction map; labels: ScaI, 1765 bp, SmaI, HindIII, EcoRI, 915 bp, ScaI; pUC13(0), pUC13(1), pUC13(2), pUC13(3)]

FIG. 19B